ABSTRACT


Effective decision-making is a hallmark of military leadership, and 
development of decision makers is critical to military strategy. The Cognitive 
Alignment with Performance-Targeted Training Intervention Model (CAPTTIM) 
was developed to aid training of optimal decision-making. Cognitive state
indicates whether a subject is exploring the decision environment or
exploiting it, and decision performance classifies whether a subject is making
optimal decisions. Using a color-coded structure combining cognitive state and
decision performance, CAPTTIM indicates whether those factors are aligned for 
optimal decision-making—exploiting the environment and making optimal 
decisions—or not. The focus of this thesis was to identify each subject’s 
CAPTTIM status in real time and, when decision performance was misaligned, 
provide feedback to influence the subject’s future decisions. 

Through a human-subject experiment (n = 34), we classified decision
makers’ CAPTTIM status in real time. We randomly assigned 17 subjects to
receive tailored feedback during execution of a decision task (feedback group), 
and trend analysis reveals the feedback group to be more likely to reach optimal 
decisions than a control group. 

These results imply that training systems could be tailored to the individual 
and that methods used to instruct effective decision-making may expand to 
include real-time understanding and intervention. 
I. INTRODUCTION


A. BACKGROUND 

Military leaders will affirm that, of the myriad critical tasks required of 
military personnel, decision-making is a crucial skill. For example, decisions 
made by junior officers and enlisted service members often have life-or-death 
consequences, and the outcomes of those decisions can have strategic 
implications capable of impacting military and government courses of action well 
beyond a particular moment of action. Thus, the need to understand how 
effective decisions are made is critical to the continued success of our armed 
forces. Military leadership recognizes the importance of agile, adaptive thinkers. 
The U.S. Army and U.S. Marine Corps have each issued strategic guidance 
initiatives directing efforts to improve decision-making. The Army’s Human 
Dimension Strategy 2015 directs the Service to “improve the decision-making 
ability and ethical conduct of Soldiers and Army Civilians through individual and 
collective learning programs that challenge Army Professionals in complex 
operational and ethical situations” (Odierno & McHugh, 2015, p. 7). Similarly, 
Marine Corps Science and Technology Objective (Training and Education) -1 
states that the Corps aims to “develop capabilities to enhance cognitive, 
relational, and perceptual skills for small unit leaders to make effective decisions 
in complex environments; enhancements include attention control, expertise, 
metacognitive skills, and accelerated learning outcomes” (U.S. Marine Corps, 
2012, p. 34). However, as military experience is hard won—specifically combat 
experience, where a leader may only ever have one chance to learn from a
decision—understanding decision-making in a training and educational 
environment has become the focus of increased study (Bechara, Damasio, 
Tranel & Damasio, 1997; Critz, 2015; Kennedy, Nesbitt, Alt & Fricker, 2015; 
Nesbitt, Kennedy, Alt, Yang, Fricker, Appleget, Huston, Patton & Whitaker, 
2013). This thesis is one small part of larger efforts striving to understand
decision-making processes and improve decision-making among service
members to increase the combat effectiveness of the military.

Combat always has been complex; however, that complexity increases 
significantly when service members are confronted with challenges beyond basic 
weapons employment, tactics, and lower-level strategy. History is rife with
leaders using measures of performance, such as enemy attrition, to draw
conclusions about the effectiveness of their operations, only to discover too late
that the information being used to drive decisions was not pertinent to the
long-term outcome of the conflict. Modern warfighters are routinely confronted with
complex battlefield situations involving noncombatants, irregular threats, 
humanitarian crises and even governance. While not every decision can be 
perfect, and military leaders will rarely have perfect information on which to base 
their decisions, it is important that warfighters possess the cognitive flexibility to 
recognize a changing situation and use the experience gained to adjust the 
decision-making process. If we, as military leaders, better understand decision
performance and an optimal decision-making process, we can train the next
generation of leaders to make the best decisions their environment
allows.

1. Cognitive Abilities Needed to Achieve Optimal Military 
Decision-Making 

Reinforcement learning, the ability to learn from trial and error, is a
cognitive characteristic necessary for individuals to achieve optimal
decision-making (Sutton & Barto, 1998). Decisions in the military environment often
involve a degree of uncertainty. When intelligence estimates of an enemy 
location or the strength of an enemy force are not well established, a military 
professional is still faced with a decision of how (or whether) to act against the 
enemy, for action is surely still required. Thus, the action relies upon the decision
maker’s accumulated experience and the reinforcement learning accrued
through that experience, whether those decisions and that learning were
optimal or not. One existing evaluation of reinforcement learning is the Iowa
Gambling Task (IGT) (Bechara, Damasio, Damasio & Anderson, 1994). The IGT
has been widely applied and documented in numerous psychology studies
(Krain, Wilson, Arbuckle, Castellanos & Milham, 2006) and will be discussed
further here, as it serves as the basis of a military-themed reinforcement-learning
cognitive analysis tool.

A second characteristic of optimal decision-making is cognitive flexibility. 
As we expect our military decision makers to learn from experience, we assume 
that the learning is incorporated into future decision-making and that existing 
problem solving strategies are adapted based upon the information being 
provided. That is, when a situation, or information within the problem space,
changes, “an individual needs to realize that the situation has changed in order to
be able to ‘log out’ of the automatic processing mode and come into the
controlled processing mode” (Cañas, Quesada, Antolí & Fajardo, 2003, p. 484).
This ability to enter the controlled processing mode is cognitive flexibility. In this 
thesis, we hope to influence the decision makers while they complete a military 
version of the IGT to bring them into this controlled processing mode, and then 
determine whether this cognitive flexibility can be leveraged toward optimal 
decisions. 

2. Current Military Decision-Making Instruction 

The current operational environment offers increased opportunity to 
understand decision-making and develop programs to more effectively train this 
critical skill. After many years of combat operations, long deployments in complex 
environments, and dynamic, difficult decision-making, the military has a unique 
opportunity to use the experience gained to understand how this population of
experienced decision makers functions, including factors such as
their cognitive state during the decision-making process. For example, when do
experienced decision makers feel that they need to learn more about the 
environment and when do they feel that they know the environment well enough 
to make optimal decisions? This opportunity may allow less-experienced
personnel and their instructors to understand cognitive state and thus leverage
situationally dependent information to make optimal decisions. Furthermore,
those agencies tasked with educating on and instructing for decision-making can 
tailor instruction to the individual decision-maker. The Basic School (TBS) is the 
U.S. Marine Corps’ entry-level training and education venue for newly 
commissioned officers. Every Marine officer—whether future armor officer, 
aviator, infantry officer, lawyer or logistician—spends six months at this school 
being educated and evaluated on tactics and leadership, of which
decision-making is a key facet. TBS is just one example of an institution that applies
significant effort to ensure junior officers have an appreciation for how to make 
effective decisions. As a former instructor at TBS, the author can confirm that the 
current method of evaluating the effectiveness of the student’s decisions relies 
upon subject matter expertise and direct evaluation of the trainee. Direct 
observation, with little appreciation for the trainee’s cognitive state or
decision-making history, leaves much to chance when trying to train and educate the
military’s future key decision makers. As we define it, the cognitive state of a 
subject, or trainee, will indicate whether he or she is exploring or exploiting the 
decision environment; that is, whether the decision maker believes they have all 
the information required to make optimal decisions. Thus, understanding the 
trainee’s cognitive state may help to produce exercises that will effectively 
instruct on the art and science of decision-making. The focus of this thesis was to 
explore whether a trainee’s cognitive state and decisions can be effectively 
influenced, in real time, toward the optimal set of decisions. 

B. DECISION MAKING 

As stated in previous work, “current reinforcement-learning tests, which 
are typically computerized laboratory tests, do not account for the stress, 
uncertainty, and high-risk conditions of decisions made in combat” (Nesbitt et al.,
2013, p. 3). We will explore an established psychological decision-making test,
its modification to a more military-relevant decision task, and the categorization
of decision-maker cognitive state and decision performance scores into a single
color-coded categorization in the Cognitive Alignment with Performance Targeted
Training Intervention tool.

1. Iowa Gambling Task 

The IGT is an established psychological test in which subjects make a 
series of decisions and the effect of reinforcement learning can be studied based 
upon the patterns of decisions observed (Bechara et al., 1994). Subjects are 
presented with a computer screen on which four decks of cards are displayed 
face down, and are told to choose cards to optimize their long-term gain. (See 
Figure 1.) 



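[Figure 1 image not reproduced. The screenshot shows the prompt "Select deck by
touching" above the lettered decks, the most recent choice (B) with its reward
(100), penalty (-0), and net gain (100), and a running total ($1450) on a scale
from -1000 to 5000.]

Figure 1. The Iowa Gambling Task Screenshot. Source: Sacchi (2015)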

The subject begins the trial with a loan of an imaginary $2000. Each card
selected results in some amount of gain and some amount of loss such that, over
time, subjects can estimate the net gain or net loss after multiple selections
and careful observation of gain/loss patterns. As success is defined as ending a
set number of trials (usually 100 to 200 individual selections) with the most money
possible, “participants can succeed on the IGT only when they learn to forgo high
immediate rewards and prefer the safe options over the risky options”
(Steingroever, Wetzels, Horstmann, Neumann & Wagenmakers, 2013, p.
180). What is initially unknown to the subject is that the payouts are
predetermined, and further, certain decks will always provide a higher long-term 
payout than others. Ultimately, the subject is meant to recognize that decks A
and B are long-term losers; although in the first few selections these decks
reward the subject, decks A and B are heavily penalized later, resulting in a net
loss over 10 or 15 selections. Previous studies have concluded, “subjects must
rely on their ability to develop an estimate of which decks are risky and which are
profitable in the long run” (Bechara et al., 1994, p. 13). Eventually, a subject
should realize that, despite smaller payouts per trial from decks C and D, the
long-term payout is greater.

2. Convoy Task 

We build on the foundation of the IGT. Past work at the Naval
Postgraduate School (NPS) converted the same decision-making evaluation
approach to a military-relevant decision-making tool called the Convoy Task.
(See Figure 2.)
[Figure 2 image not reproduced. The screen shows the prompt "Select route for
next convoy," the running Accumulated Damage score (2500), and the outcome of
the latest decision: Damage to Enemy Forces (50) and Damage to Friendly Forces
(-250).]

The decision just executed by this subject has resulted in a gain of 50 damage
points (Damage to Enemy Forces) and a loss of 250 damage points (Damage to
Friendly Forces) for a net change to Accumulated Damage of -200 points.

Figure 2. Convoy Task Screen.

The creators of the Convoy Task state that “this new task focuses on high 
stakes and uncertain environments particular to military decision making 
condition and retains essential characteristics of the foundational task and gives 
insight into reinforcement learning of military decision makers” (Nesbitt et al.,
2013, p. 10). As opposed to a monetary reward and penalty system, the creators
used a more military-relevant scoring system: damage to enemy and friendly
forces. Damage to Enemy Forces is the reward and adds to the running score,
termed Accumulated Damage, which stands in for the $2,000 loan amount in the
IGT. The penalty is termed Damage to Friendly Forces, and it subtracts from
Accumulated Damage (Nesbitt et al., 2013). And rather than identical decks of
cards, subjects are presented with four identical photos of a nondescript road
that might depict a convoy route. Past data collected from 34 subjects confirmed 
that the Convoy Task requires reinforcement learning to effectively add to the 
total Accumulated Damage score (Kennedy et al., 2015). 
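
The scoring rule described above reduces to simple arithmetic. The following is
a minimal Python sketch using the reward and penalty values from the Figure 2
note. The function name `apply_outcome` and the starting score are illustrative
assumptions and are not taken from the Appendix B code.

```python
def apply_outcome(accumulated_damage, enemy_damage, friendly_damage):
    """Return the new Accumulated Damage after one convoy decision.

    enemy_damage is the reward; friendly_damage is the penalty, passed
    as a positive magnitude and subtracted from the running score.
    """
    net_change = enemy_damage - friendly_damage
    return accumulated_damage + net_change


# Figure 2 example: 50 points of enemy damage, 250 points of friendly damage.
# The starting score of 2000 is a hypothetical value chosen for illustration.
new_score = apply_outcome(2000, enemy_damage=50, friendly_damage=250)
print(new_score)
```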
3. Cognitive Alignment with Performance Targeted Training
Intervention

Efforts at NPS by Kennedy et al. (2015) resulted in a model called 
Cognitive Alignment with Performance Targeted Training Intervention Model 
(CAPTTIM) that places subjects into one of four color-coded categories based 
upon cognitive state and decision performance. This model distinguishes 
between two subject cognitive states: exploration (feeling that one has not 
figured out the task and needs to explore the environment more) and exploitation 
(where a subject thinks that they have mastered the task and is acting upon 
acquired knowledge). The model then determines whether cognitive state is 
aligned or misaligned with observed decision performance. (See Figure 3). 
CAPTTIM utilizes simple behavioral measures to characterize cognitive state and 
decision performance. It uses variability in latency from decision to decision to 
determine whether the trainee’s cognitive state is exploration (large latency 
variability) or exploitation (small latency variability). Decision performance is 
measured by regret, the difference between the trainee’s decision and the 
optimal decision, given perfect knowledge of the task. High regret indicates poor 
decision performance; low regret indicates near optimal decision performance. 
Thus, accumulated regret provides a measure of how far off the trainee is from 
the optimal decision path. In a 2015 NPS master’s thesis, Critz
established the threshold delineating high from low regret for each
decision during the same decision-making task and concluded, “by looking at a
common reinforcement learning task, modified for the military domain the thesis 
team was able to investigate and better understand a subject’s decision-making 
pattern” (Critz, 2015, p. 50). 
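
The two-factor categorization can be sketched as a simple lookup. This is a
hedged illustration only: the function names are hypothetical, and the color
assigned to each combination is an assumption made for the sketch (the thesis
names the four colors, but Kennedy et al. (2015) remains the authoritative
source for which cell carries which color).

```python
def capttim_category(cognitive_state, regret_level):
    """Map (cognitive state, regret level) to a CAPTTIM color.

    ASSUMPTION: the color placed in each cell below is illustrative;
    consult Kennedy et al. (2015) for the authoritative assignment.
    """
    table = {
        ("exploration", "high"): "yellow",   # seeking information, not optimal
        ("exploration", "low"): "orange",    # seeking information, yet optimal
        ("exploitation", "high"): "red",     # acting on knowledge, not optimal
        ("exploitation", "low"): "green",    # acting on knowledge, optimal
    }
    return table[(cognitive_state, regret_level)]


def is_aligned_for_optimal(cognitive_state, regret_level):
    """True when the subject is exploiting and decision performance is optimal."""
    return cognitive_state == "exploitation" and regret_level == "low"


print(capttim_category("exploitation", "low"))
```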
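[Figure 3 image not reproduced. The figure is a two-by-two, color-coded matrix
crossing Cognitive State (Exploration, Exploitation) with Decision Performance
(regret). Recoverable cell text:
Exploration, decision performance not optimal (yellow): Seeking information,
and decision performance is not optimal. Remaining in the yellow cell for long
can be a concern.
Exploration, decision performance optimal: Seeking information, yet decision
performance is optimal.
Exploitation, decision performance optimal: Acting upon acquired knowledge,
and decision performance is optimal.
The remaining cell (exploitation, decision performance not optimal) is flagged
as requiring training.]

Figure 3. CAPTTIM Categories and Corresponding Cognitive State and
Regret Information. (Source: Kennedy et al., 2015)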

The CAPTTIM model has shown results that suggest we are able to (1) 
accurately classify a subject’s cognitive state and decision performance at the 
trial-by-trial level and (2) determine which subjects made the transition to the 
optimal decision path (Subject 14) and which subjects would benefit from 
individualized feedback (Subjects 11 and 33). (See Figures 4 through 6). 
Subject 14 shows the ideal transition from exploration to optimal
decision-making. Note: Yellow, orange, red, and green indicate CAPTTIM categorization
for a given trial. Blue vertical spikes represent trials in which subjects received
strong negative feedback.


Figure 4. Critz (2015) Subject 14 CAPTTIM Categorization of Decision
Behavior at the Trial-by-Trial Level. (Source: Critz, 2015)
[Figure 5 image not reproduced: plot titled "Subject 11 CAPTTIM" with Trial on
the horizontal axis.]

Subject 11 never quite figured out the task. Note: Yellow, orange, red, and green
indicate CAPTTIM categorization for a given trial. Blue vertical spikes represent
trials in which subjects received strong negative feedback.


Figure 5. Critz (2015) Subject 11 CAPTTIM Categorization of Decision
Behavior at the Trial-by-Trial Level. (Source: Critz, 2015)
[Figure 6 image not reproduced: plot titled "Subject 33 CAPTTIM" with Trial
(0 to 200) on the horizontal axis.]

Subject 33 consistently exploited poor choices despite receiving strong negative
feedback. Note: Yellow, orange, red, and green indicate CAPTTIM categorization
for a given trial. Blue vertical spikes represent trials in which subjects received
strong negative feedback. (Critz, 2015)

Figure 6. Critz (2015) Subject 33 CAPTTIM Categorization of Decision
Behavior at the Trial-by-Trial Level. (Source: Critz, 2015)
As this thesis aims to demonstrate that decision performance can be
improved using tailored messages when a subject’s cognitive state is misaligned 
with decision performance, we must explore how to effectively communicate the 
need for a change in decision-making strategy; we need to be able to 
immediately and effectively communicate to a subject that their decision-making 
pattern is not optimal. That is, how and what do we communicate to a Subject 33 
(depicted in Figure 6) that will cause decision performance to transition to optimal 
decisions such as portrayed by Subject 14 (depicted in Figure 4)? 

C. DECISION-MAKING TRAINING INTERVENTION 

Much of the challenge of the current CAPTTIM-based thesis was to
convert the retrospective analysis of cognitive state and decision performance
contained within the CAPTTIM model, and depicted in Figure 3 above, to a
near real-time system. The current effort would be fruitful only if the real-time
recognition of cognitive state and decision performance could be used to alert a
subject to suboptimal performance and attempt to influence the decision-making 
strategy toward a preferred end state. Thus, one aspect of this thesis involved 
determining the type of feedback to give to subjects. 

The type of feedback to give to subjects was guided by studying literature
on other experience-based learners, namely language acquisition students. Most
evident in the literature related to ‘feedback to students’ and/or ‘intervention in
education/training’ were techniques used by second-language teachers and
learners. In Corrective Feedback and Learner Uptake, the authors study when,
how, and which learner errors should be corrected (Lyster & Ranta, 1997). This
information is pertinent to when, how, and which subject cognitive
misalignments should be corrected or guided in our experiments. Of the six types
of feedback discussed in a literature review, we surmise that, while effective, 
explicit correction could result in the decisions of our subjects being influenced 
too firmly toward the desired decision path. 
We are studying whether a subject can learn through experience during
execution of a task, not whether they understand their own particular reasoning 
behind the change in strategy. In our subjects, we are seeking self-repair, which 
“refers to a self-correction in response to the feedback when the latter does not 
already provide the correct form” (Lyster & Ranta, 1997, p. 50). We do not want 
to hand the subject the answer but rather guide their experience-based learning 
based upon our evaluation of their CAPTTIM classification (Red, Orange, Yellow, 
Green). Therefore, we crafted our feedback messages to subjects to be in the 
form of metalinguistic feedback, which “contains either comments, information, or 
questions related to the subject’s response, without explicitly providing the right 
answer” (Lyster & Ranta, 1997). The specific guidance offered to subjects based 
on their current CAPTTIM categorization will be detailed below. 

D. THESIS MOTIVATION 

Decision-making is what leaders do. As the decisions of military leaders 
become more and more complex, and have the potential for greater and greater 
impacts, it is imperative that we understand the process of decision-making and 
attempt to build training systems and techniques that develop leaders who 
understand how to tend toward optimal decisions. We want to evolve past using 
a single instructor’s best guess at whether a single trainee is making optimal 
decisions. This thesis extends past study of decision-making at NPS in an
attempt to capture the decision maker’s cognitive state in real time and, further,
to influence suboptimal decisions.
II. METHODS


The NPS Institutional Review Board approved our study to test whether 
CAPTTIM-oriented feedback could aid optimal decision-making; several 
methodological steps were completed to arrive at the final experiment. This 
section will first illustrate how previous work used a retrospective approach to 
identify a subject’s cognitive state as exploring or exploiting the decision-making 
environment and to classify decisions as optimal or suboptimal decisions through 
the use of a quantitative metric of decision performance called regret. This 
previous work also showed that those two factors (cognitive state and decision 
performance) could be retrospectively combined to represent a subject’s 
placement in CAPTTIM. Next, the methodological steps used in the current work 
to apply the CAPTTIM categorization in real time will be discussed. The final
methodological steps were to use real-time CAPTTIM categorization to provide
timely and targeted feedback to subjects as they complete the Convoy Task. The
Python executable code for the modified Convoy Task (with and without 
feedback windows) is available in Appendix B. 

A. PREVIOUS WORK IN DEFINING COGNITIVE STATE AND REGRET 

Previous work in decision-making at NPS has used the two factors of
‘cognitive state’ and ‘decision performance’ to classify the subject into one of four
color-coded CAPTTIM categories (Kennedy et al., 2015; Critz, 2015). The
evolution of these factors, and the demonstration that they can accurately
categorize whether a subject’s cognitive state is aligned or misaligned with
observed decision performance, will be used as the foundation for application to
real-time analysis of optimal decision-making.

1. Cognitive State: Exploration and Exploitation

Nesbitt et al. (2013) classified a subject’s cognitive state by utilizing an 
exponentially weighted moving average (EWMA) of the latency between 
decision-making times. 
An individual EWMA value is calculated as:

$$Z_i = \lambda X_i + (1 - \lambda) Z_{i-1}$$

where $Z_i$ is the EWMA control statistic, $\lambda$ is the weighted parameter, and $X_i$ is the
actual observed data value. The time between decisions was captured based on 
when a subject clicked on a route; using the computer’s clock time we calculated 
the latency between clicks. Kennedy et al. (2015) showed that latency times
were exceptionally long after the subject experienced high damage, and that
latency times after low damage were relatively short. In order to determine
whether a subject’s latency time on a given trial was exceptionally long, a
baseline latency time was established for each subject. Because previous work
was completed retrospectively, all 200 trials were used to define the baseline as
consisting of those latency times in which the subject received no to minimal
friendly damage on the previous trial. Exploration thus was defined as a set of
trials wherein the deviation between latency times was 2 SD or more above
the baseline. Exploitation was defined as occurring on all other trials, i.e., trials in
which the deviation between latency times was less than 2 SD above the
baseline.
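
The retrospective classification described above can be sketched in Python as
follows. This is one hedged reading of the procedure, not the original analysis
code: the function names, the weighting value `lam=0.2`, the `minimal_damage`
cutoff, and the interpretation of "2 SD above the baseline" as the baseline mean
plus two baseline standard deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev


def ewma(values, lam=0.2):
    """Z_i = lam * X_i + (1 - lam) * Z_{i-1}, seeded with the first value."""
    smoothed = []
    z = values[0]
    for x in values:
        z = lam * x + (1 - lam) * z
        smoothed.append(z)
    return smoothed


def classify_retrospective(latencies, prev_damage, minimal_damage=0,
                           lam=0.2, k=2.0):
    """Label each trial 'exploration' or 'exploitation' after the fact.

    prev_damage[i] is the friendly damage received on the trial before
    trial i. Baseline latencies are those that followed no-to-minimal
    damage (ASSUMPTION: damage <= minimal_damage).
    """
    baseline = [lat for lat, dmg in zip(latencies, prev_damage)
                if dmg <= minimal_damage]
    limit = mean(baseline) + k * stdev(baseline)
    return ["exploration" if z >= limit else "exploitation"
            for z in ewma(latencies, lam)]


# Hypothetical data: steady latencies, then long pauses after heavy damage.
latencies = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 9.0, 9.0, 9.0]
prev_damage = [0, 0, 0, 0, 0, 0, 0, 500, 500, 500]
states = classify_retrospective(latencies, prev_damage)
print(states)
```

Smoothing with the EWMA before thresholding means a single slow click moves the
statistic only part of the way toward the new value, so the size of `lam`
controls how quickly the label reacts.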

2. Regret as a Measure of Decision Performance 

We use regret as the decision performance input to the real-time 
CAPTTIM category placement. Regret is the difference, in points, between an 
optimal decision and the subject’s decision. Kennedy et al. (2015), Nesbitt,
Kennedy, Alt & Fricker (2015), and Critz (2015) all expanded from the IGT-based
definition of regret in order to allow for more specificity in classifying users by
CAPTTIM state. Because we know the payout of each route before the
experiment begins, we also know, for any given trial, which route provides the 
best payout. Thus, regret can be calculated as the difference between the 
optimal score for a given trial and the score achieved by the subject’s decision on 
that trial (Nesbitt et al., 2013). 
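
As a worked illustration of this definition, the sketch below computes per-trial
regret from a known payout table. The function name `trial_regret`, the route
labels, and the payout values are hypothetical; the actual predetermined payouts
are not reproduced here.

```python
def trial_regret(payouts_for_trial, chosen_route):
    """Regret = best available payout on this trial minus the payout chosen."""
    best = max(payouts_for_trial.values())
    return best - payouts_for_trial[chosen_route]


# Hypothetical net payouts for one trial, keyed by route label.
payouts = {"A": -150, "B": 50, "C": 25, "D": 50}

print(trial_regret(payouts, "A"))  # poor choice, large regret
print(trial_regret(payouts, "B"))  # optimal choice, zero regret
```

Summing these per-trial values over the task gives the accumulated regret, a
measure of how far the subject has strayed from the optimal decision path.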
Previous thesis work at NPS determined that the best method to delineate
between a subject’s high or low regret is to compare the “process mean for a
window of trials with the median of the process to determine whether it fell above
or below the median. If the process mean was above the median, the subject
was categorized as having high regret; if the process mean was below the
median, the subject was categorized as having low regret” (Critz, 2015, p. 33).
This information was derived by use of the statistical software program RStudio
and the use of built-in change point analysis tools.

Change point analysis is a method for determining whether a change has
taken place in a set of values over time and, specifically, at which event or
time that change happened. Software tools take a large set of data (including
non-normal, ill-behaved, or outlier-laden data) and determine when
significant changes occurred by noting a sudden change in direction of the
cumulative sum (Taylor, 2000). Further, examining previous work and the
establishment of the EWMA window we find that “the R package utilized in this 
analysis was the segment neighborhood algorithm which utilizes dynamic 
programming to calculate the optimal segmentation for m + 1 change points and 
reuses the data calculated for m change points” (Critz, 2015, p. 25). The 
algorithm examines an entire set and identifies where the set can be segmented 
to illustrate significant changes in value. As a subject’s regret may change by
many points at every decision, this resulted in too many change points. Therefore,
Critz (2015) specified a smaller number of changes (15) that still identified the
subject’s regret but did not display erratic, unreadable data.
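
To make the cumulative-sum idea concrete, the sketch below estimates a single
change point in plain Python. It illustrates only the "change in direction of
the cumulative sum" principle; it is not the segment neighborhood algorithm from
the R package used in the earlier work, it finds at most one change, and the
function names and data are hypothetical.

```python
def cusum(values):
    """Cumulative sum of deviations from the overall mean, starting at 0."""
    avg = sum(values) / len(values)
    sums = [0.0]
    for x in values:
        sums.append(sums[-1] + (x - avg))
    return sums


def estimate_change_point(values):
    """Return the 0-based index of the first observation after the change.

    The cumulative sum drifts one way while values sit below the overall
    mean and reverses when they move above it (or vice versa); the point
    of largest magnitude marks the last observation before the shift.
    """
    sums = cusum(values)
    return max(range(1, len(sums)), key=lambda i: abs(sums[i]))


# Hypothetical regret series with a level shift after the tenth trial.
series = [1] * 10 + [5] * 10
print(estimate_change_point(series))
```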

B. PILOT TESTING 

The modified Convoy Task code (Appendix B) was pilot tested to ensure it
accurately reflected the foundational work creating and validating the CAPTTIM
model (Nesbitt et al., 2013; Critz, 2015; Kennedy et al., 2015; Nesbitt et al.,
2015). Pilot testing was conducted over two weeks and in two separate sessions.
We used members of the thesis team who, while familiar with the overall
construct of the experiment, did not have intimate knowledge of the modifications
to the Convoy Task code and thus were able to provide usable feedback and
data. We will highlight issues and resolutions of each pilot test period.

1. Pilot Testing: Correcting Modified Code for Cognitive State 

Initial piloting runs exposed problems with converting the retrospective 
analysis of previous work to the real-time CAPTTIM assignment required of the 
hypothesis of this thesis. This piloting revealed the requirement to address 
discrepancies in the computation of a subject’s baseline cognitive state. Recall 
that the explore/exploit cognitive state is assigned based upon the exponentially 
weighted moving average (EWMA) of the standard deviations of latency per trial 
as compared to the subject’s baseline latency and SD thereof. Initial code 
modification of the Convoy Task stored the raw time between each decision as 
‘latency’ and then queried this list to determine if the most recent decision was 
faster or slower than the overall average decision time. This method neglected 
two important contributing factors to properly computing a subject’s cognitive 
state. First, because cognitive state characterization is based on variability in the 
SD of latencies, we need to establish the SD of a subject’s baseline latency time 
and compare to that. Upon modification to the code, we used the first 50 trials of 
the Convoy Task to capture these baseline latency times and the SD associated 
with them. Latency times from only those decisions that did not result in 
exceptionally high Friendly Damage are stored and processed as the baseline. 
Second, as opposed to comparing the single most recent latency time to the
baseline (or the overall average), the program is required to compare the
standard deviation of the ten most recent trials to the standard deviation of the
good-decision baseline trials drawn from the first 50 trials. A threshold was
applied to the SD of the
EWMA of latency times in order to delineate between the cognitive states of
exploration and exploitation. Based on extensive pilot testing, we calculated the
SD of decision times and assigned the subject to ‘explore’ or ‘exploit’ depending
on whether that SD was above or below 1.5 times the standard deviation of
baseline latency
times. The actual number associated with the ‘explore’ or ‘exploit’ cognitive state
was specific to each subject, as the comparison was being made to his or her
individual baseline. This extension of the foundational work of Nesbitt et al. (2013
and 2015) and Critz (2015) successfully incorporated the EWMA methodology to
properly compare the standard deviations and determine whether the individual
subject is making abnormally slow decisions relative to his or her norm, which
would indicate an ‘exploration’ state.
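
The real-time rule described in this subsection can be sketched as a small
tracker that is updated once per trial. This is a simplified illustration, not
the Appendix B code: the class and parameter names are hypothetical, the caller
is assumed to decide what counts as "exceptionally high" Friendly Damage, and
the sketch compares raw latency standard deviations without the EWMA smoothing
step.

```python
from collections import deque
from statistics import stdev


class CognitiveStateTracker:
    """Classify 'explore' or 'exploit' from latency variability, per trial."""

    def __init__(self, baseline_trials=50, window=10, factor=1.5):
        self.baseline_trials = baseline_trials  # trials used to build baseline
        self.factor = factor                    # multiple of baseline SD
        self.trials_seen = 0
        self.baseline = []                      # good-decision latencies
        self.recent = deque(maxlen=window)      # most recent latencies

    def update(self, latency, high_friendly_damage):
        """Record one trial; return a state once the baseline is complete."""
        self.trials_seen += 1
        self.recent.append(latency)
        if self.trials_seen <= self.baseline_trials:
            # Only decisions without exceptionally high damage form the baseline.
            if not high_friendly_damage:
                self.baseline.append(latency)
            return None
        if len(self.baseline) < 2:
            return None  # not enough baseline data to compute an SD
        recent_sd = stdev(self.recent)
        baseline_sd = stdev(self.baseline)
        return "explore" if recent_sd > self.factor * baseline_sd else "exploit"


tracker = CognitiveStateTracker()
for i in range(50):
    tracker.update(1.0 if i % 2 == 0 else 1.2, high_friendly_damage=False)
print(tracker.update(1.1, high_friendly_damage=False))
print(tracker.update(5.0, high_friendly_damage=False))
```

Because the comparison is made against each subject's own baseline SD, the same
absolute latency can be labeled differently for different subjects.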

2. Pilot Testing: Correcting Modified Code for Decision 
Performance 

Pilot testing also allowed the team to discover aspects of the Python code
used by Critz (2015) that did not directly translate to classifying a subject’s
decision performance (regret) in real time. Critz (2015) determined that a window
of 15 trials worked well for retrospective categorization of regret. Because the
current work did not require smooth transition curves to illustrate
reinforcement learning, we chose to shorten this window for the real-time
analysis of regret. We coded a simple algorithm that measures subject
performance and categorizes regret according to the accepted EWMA model using
only the previous 10 decisions. This modification allowed more opportunities to
observe variability in subject decision performance and (if the subject was a
member of the feedback group) to influence future decisions toward the optimal
by displaying a message to guide decision-making strategy. We maintained the
same general model, in which regret is defined as high when the process mean for
a given window of trials is above the median for those same trials; we simply
used a window of the last 10 decisions rather than 15 trials.
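
As a hedged illustration of this rule, the sketch below labels regret from a list of per-trial regret values. It assumes the "process mean" can be approximated by the plain arithmetic mean of the window; the name `regret_level` and the treatment of a mean equal to the median as low regret are also assumptions, not details taken from Appendix B.

```python
from statistics import mean, median

REGRET_WINDOW = 10  # number of most recent decisions considered (from the text)


def regret_level(regret_per_trial):
    """Return 'High' when the window mean exceeds the window median."""
    window = regret_per_trial[-REGRET_WINDOW:]
    # Assumption: a mean equal to the median is treated as low regret.
    return "High" if mean(window) > median(window) else "Low"
```

A single large loss in an otherwise steady window pulls the mean above the median, which is what flags the window as high regret under this rule.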

Further, while the code was originally written using the concept of ‘gain’
from the Iowa Gambling Task, the concept of regret (and its use as one variable
of CAPTTIM classification) needed to be recognized and adapted as the opposite
of gain in order to properly define whether the regret was “high” or “low.”
Initially we did not recognize that regret as we use it is the opposite of the
gain previously encoded in the Convoy Task, and thus we found that subjects’
cognitive state was incorrectly assigned. Interestingly, the regret assigned to a
subject (high or low) was not exactly the opposite of what was intended, which
would have been the first assumption if ‘regret’ = ‘-gain’. Rather, the
inequalities used, combined with the sliding window of trials, resulted in
unpredictable behavior and clearly improper assignment of CAPTTIM category. Once
the pilot test revealed the incorrect assignment of CAPTTIM category, it was
relatively simple to backtrack through the data and code by hand and realize
that the inequalities in the code were reversed. With this correction made, we
used editable lists in the code to append the regret-per-trial value (called
damage as per Nesbitt et al., 2015) and analyze the regret value of the previous
10 trials. The comparison of the median of the last 10 trials to the average, as
discussed above, was then relatively straightforward, and all four categories of
CAPTTIM (Red, Yellow, Orange, Green) were properly assigned to subjects during
final pilot testing and on into experimentation.

Finally, an additional change we made from the original Convoy Task code with
regard to regret was to remove the automatic ‘red’ CAPTTIM categorization of
those subjects who incurred extreme friendly damage after trial 100. Critz (2015)
automatically assigned ‘high’ regret to subjects who incurred a ‘bad’ route after
trial 100. We did not incorporate this classification into the Convoy Task, as it
was our goal to show an ability to influence decision makers regardless of trial
number. Had we automatically placed subjects into a high regret state, we might
have displayed an improper message to a subject in the feedback group when
another message would have been more appropriate given the regret state based
purely on the mean/median comparison detailed above.

3. Pilot Testing: Capturing Data Outside of Established Change 
Points 

As pilot testing continued, we realized that we did not have enough data
during each subject’s run to confirm or reject the hypotheses regarding the
proportion of trials in the green/red CAPTTIM classification. As mentioned above,
Critz (2015) used a window of 15 trials in the change point analysis to determine
when CAPTTIM classification occurs or changes. Thus, initially our Python code
only captured the CAPTTIM classification every 15th trial after the baseline 50.
This approach was sufficient for issuing feedback to a subject in hopes of
optimizing future decisions, but to analyze proportions after the completion of
the experiment it was necessary to capture the cognitive state and regret data
at each trial. The modification to the code was relatively minor (and is
reflected in the final code used in the experiment as per Appendix B), but the
correction to the design of the experiment was significant and allowed the team
to move forward into experimentation confident that enough data would be
collected to compare a control (no feedback) group with an experimental
(feedback) group.

4. Summary of Pilot Testing Changes 

Overall, pilot testing illustrated four key changes to ensure the program
used for this thesis captured and processed information effectively and
categorized subjects into the validated CAPTTIM:

1. Incorporating the EWMA to analyze the SD of decision times and capture
cognitive state, vice simply comparing the latency times to a subject’s
average decision time.

2. Calculating the baseline latency time from the first 50 trials rather
than retrospectively over the entire set of trials. Thus, in this study,
the Convoy Task had 250 trials: the first 50 to acquire the baseline
latency time and the remaining 200 for CAPTTIM assignment.

3. Correcting for an inaccurate assignment of high/low regret based
on the subject’s point gain as originally coded in the Convoy Task.
Given the real-time nature of the data capture in this thesis, the
regret is captured in the same sliding window comparing the
average of the last ten decisions to the median of the last ten, but
we had initially not recognized the need to invert the inequalities for
correct CAPTTIM assignment.

4. Capturing the CAPTTIM category for each subject on every trial, vice
the set number of trials established by the change point analysis of
Critz (2015). Because processing the data for every trial was one of the
main goals of the thesis, this change, though relatively simple, was a
key one exposed in the pilot tests.

C. PARTICIPANTS

All subjects were recruited from the student body of NPS. As such, all 34
were military officers, spanning all services: 14 U.S. Marine Corps, eight U.S.
Army, eight U.S. Navy, and four U.S. Air Force. These subjects were randomly
assigned to two groups. There was no significant difference in demographic
characteristics between the two groups (all p-values > 0.47). The control group
and the feedback group each contained 17 subjects, 14 men and 3 women. The
average age of the control group was 34.71 years (SD=3.64 years), and that of
the feedback group was 32.53 years (SD=4.08 years). The control group had
slightly more time in service: an average of 13.47 years (SD=4.56 years) versus
the feedback group’s 10.06 years (SD=4.13 years). Despite the slight difference
in years of service, the deployment record of the subjects within each group was
the same: 14 members of each group had deployed to a combat zone and 3 had not,
and the median year of each group’s members’ return from their imminent danger
pay deployment was 2013. The median rank was O-3 (lieutenant in the sea
services, captain in the ground services and air force).

D. CONVOY TASK

As detailed in Nesbitt et al. (2015), subjects saw four identical routes. (See
Figure 7). Subjects were instructed that, over a pre-set number of trials, they
would choose the route on which to send each convoy. Subjects added to or
subtracted from their Accumulated Damage score by inflicting Enemy Damage or
taking Friendly Damage, respectively. Subjects were told during instructions
that the pictures were identical. Their goal was to learn, from the experience
of friendly and enemy damage at each trial, which routes achieve the maximum
Accumulated Damage score.

[Figure 7 screen content: prompt “Select route for next convoy.”; Accumulated
Damage: 2500; Damage to Enemy Forces: 50; Damage to Friendly Forces: -250]

The decision just executed by this subject has resulted in a gain of 50 damage
points (Damage to Enemy Forces) and a loss of 250 damage points (Damage to
Friendly Forces) for a net change to Accumulated Damage of -200 points.

Figure 7. Convoy Task Screen.

As can be seen in Appendix A, the routes have the same payout as the 
decks of cards in the original IGT (Bechara et al., 1994): routes 3 and 4 are 
considered good; routes 1 and 2 are considered bad. Participants receive 
immediate results of each trial by observing the Damage to Enemy Forces, 
Damage to Friendly Forces and Accumulated Damage score from the current 
decision. 
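
The score bookkeeping described in this section can be sketched in a few lines of Python. The function name `apply_trial` is hypothetical, and the use of `abs` is an assumption made so that friendly damage can be supplied either as a positive loss or as the negative value shown on the task screen.

```python
def apply_trial(accumulated, enemy_damage, friendly_damage):
    """Return (new Accumulated Damage, net change) for one decision."""
    # Friendly damage is treated as a loss whichever sign it is supplied with.
    net = enemy_damage - abs(friendly_damage)
    return accumulated + net, net
```

For the decision described under Figure 7 (50 points of Damage to Enemy Forces and 250 points of Damage to Friendly Forces), the net change computed this way is -200 points.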

E. FEEDBACK TO SUBJECTS 

To examine whether messages to subjects can influence future decision-making
toward optimal decisions (and whether there is a significant difference
between those subjects and a control group that did not receive any feedback
during execution of the task), we first had to determine how to administer the
feedback. We reviewed literature on feedback to trainees during execution of
tasks and corrections made to students in second language learning (Archer,
2010; Chickering & Gamson, 1987; Lyster & Gamson, 1997) to determine the
most acceptable method to offer input to subjects about their performance and
decision-making strategy. We had to decide carefully what information to provide
the subjects to influence decision-making without simply providing the exact
proper strategy to succeed in maximizing score on the Convoy Task. We arrived
at the messages corresponding to each CAPTTIM color category. (See Table 1).
Also, a screenshot showing one of these four messages as seen by a subject is
provided. (See Figures 8 and 9).

Further, we discussed when and how often to offer feedback. As we had
already determined that the first 50 trials would be used to establish a subject’s
baseline latency time (the primary determinant of cognitive state), we continued
the pattern and began the feedback to subjects after trial 50, repeating it every
tenth trial. We demonstrate in the Results section that the CAPTTIM
categorization is knowable at every trial, allowing the messages in Table 1 to be
displayed in pop-up windows when desired. Again, the executable Python code
to view this computerized task is available in Appendix B.


Table 1. Messages Provided to Subjects in Feedback Group via Pop-up Windows

CAPTTIM Category                   Message to subject in feedback group
Green (Exploit and low regret)     Score is looking good. Stay with your strategy
Yellow (Explore and high regret)   Score could be better; attend to friendly damage
Orange (Explore and low regret)    Score looking good, go ahead and make decisions
                                   quickly
Red (Exploit and high regret)      Score could be better, attend to friendly damage
                                   and try other routes.

Messages were displayed every 10 decisions after trial number 50, based on the
CAPTTIM category at that decision. Text in parentheses indicates the cognitive
state and regret level associated with each CAPTTIM category.
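
The feedback schedule and the messages in Table 1 can be combined in a short Python sketch. It is illustrative only and is not the Appendix B code. The function name is hypothetical, and the choice of trial 60 as the first feedback trial is an assumption, since the text says only that feedback began after trial 50 and repeated every tenth trial.

```python
# Message text is taken from Table 1; the dictionary layout is illustrative.
FEEDBACK_MESSAGES = {
    "GREEN": "Score is looking good. Stay with your strategy",
    "YELLOW": "Score could be better; attend to friendly damage",
    "ORANGE": "Score looking good, go ahead and make decisions quickly",
    "RED": "Score could be better, attend to friendly damage and try other routes.",
}

BASELINE_TRIALS = 50    # baseline period during which no feedback is given
FEEDBACK_INTERVAL = 10  # feedback repeats every tenth trial


def feedback_for_trial(trial, category, in_feedback_group):
    """Return the pop-up message due at this trial, or None if none is due."""
    if not in_feedback_group or trial <= BASELINE_TRIALS:
        return None
    # Assumption: the first message appears at trial 60, then 70, 80, and so on.
    if (trial - BASELINE_TRIALS) % FEEDBACK_INTERVAL != 0:
        return None
    return FEEDBACK_MESSAGES[category]
```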

[Figure 8 screen content: prompt “Select route for next convoy.”; Accumulated
Damage: 2750; Damage to Enemy Forces: 50; Damage to Friendly Forces: 0]

The strategy executed by this subject has resulted in CAPTTIM categorization of
‘Green’ and the resultant message of “Score is looking good; stay with your
strategy” in the pop-up window.

Figure 8. Convoy Task Screen Showing Feedback Pane

Each of the four messages is displayed in a pop-up window with the same
formatting and requires the subject to click on the ‘OK’ button to continue the
task. This subject is in the green CAPTTIM category and thus is encouraged to
stay with the current decision-making strategy.

Figure 9. A Closer Look at the Convoy Task Feedback Pane.

F. SURVEYS


We used surveys before and after the experiment to: 1) gather 
demographic factors that may have been relevant to statistical analysis and 2) 
collect strategies employed, and impressions of the experiment after completion. 

1. Demographic Survey 

The demographic survey included questions regarding branch of service,
deployment history and general subject information such as age and rank. (See
Appendix C). This survey allowed us to verify the active duty military status of
subjects and ensure that results measured between the control and feedback
groups were not due to other demographic characteristics.

2. Post Task Survey 

The post task survey queried subjects for qualitative input about their 
experience and decision-making strategy during the experiment. It also contained 
questions asking whether subjects changed their approach to decision-making 
during the task and if so, why. (See Appendix D). 


G. PROCEDURES 

This study was approved by NPS’s Institutional Review Board. The overall
concept of the experiment was to conduct the computerized Convoy Task on a
single subject during a single visit to the lab. The experiment was designed to
take less than one hour and was planned to take place during normal working
hours at a time convenient to the individual volunteer subjects. Recruitment of
subjects was conducted from among the student population of NPS by publishing
a written advertisement on the school’s intranet site, where each student must
read announcements once daily.

Once participants reported to the lab, an explanation of the general
process was provided and the informed consent procedure was completed. If a
subject consented to participate in an additional survey collecting data regarding
head injuries, they completed The Ohio State University Traumatic Brain Injury
(TBI) Identification short form. The data collected will not be discussed in this
thesis as it is beyond the scope; the information was collected as part of a larger
study. Whether or not a subject chose to participate in the head injury data
collection, they completed a demographic survey as detailed above. The
experimenter then randomly assigned subjects into the control group or the
feedback group.

Eye-tracking hardware and software were calibrated to each individual to 
allow collection of gaze data during execution of the Convoy Task. The eye 
tracking software automatically generates two files that may be used to examine 
a subject’s gaze point throughout the execution of the task and may then be 
analyzed to determine if there is correlation between designated factors (scores, 
proportion of time in each CAPTTIM category, etc.) and the subject’s attention to 
data displayed on the screen. The eye tracking data was collected for a larger 
project and also will not be discussed here, as it is not within the scope of this 
thesis. 

The experimenter used a script to explain the Convoy Task screen and 
task requirements to each subject in detail. Once the subject affirmed an 
understanding of the screen and the task, eye-track recording was begun and the 
subject was allowed to make decisions, uninterrupted, by using a mouse to click 
on a route. Each subject completed the Convoy Task by making 250 individual 
decisions to maximize a total score. If a subject was assigned to the feedback 
group, he or she received on-screen feedback via standard pop-up windows 
every 10 trials that offered guidance to the subject based upon their CAPTTIM 
categorization. If a subject was assigned to the control group, he or she received 
no on-screen feedback. 

Finally, subjects answered the post task survey and the experimenter was 
available to answer questions about the study, its goals and potential uses of the 
results to develop training systems or techniques. 

III. RESULTS


This section discusses the statistical results of the experiment and efforts
to answer the research questions. In reviewing the results we discuss
subject-data preparation overall and whether the experiment was able to
adequately answer the research questions and hypotheses in detail. The larger
research questions to be addressed are (1) whether cognitive state and decision
performance (regret) data could be captured in real time while the subject
completed the Convoy Task, and (2) whether feedback offered to a subject based
upon cognitive state and regret data would cause the subject to achieve better
results (i.e., optimal decision making). The latter research question is divided into
four hypotheses, which will be reviewed in detail and answered individually.
Statistical methods and α-levels will be explained in conjunction with the specific
hypotheses to which each applies.

A. PRELIMINARY ANALYSES 

Preliminary analyses revealed that there was no significant difference in
demographic characteristics between the two groups detailed in the Participants
section (all p-values > 0.47). Additionally, there were no significant differences in
score performance on the Convoy Task by age, gender, military branch of
service, or deployment history. For these demographic factors we used
two-sample t-tests with a two-tailed alpha level of .05 to compare means. Mean
scores were not significantly different by gender (t(34)=0.75, p=0.47). For age,
we separated the subjects into old and young groups based upon the median age
of all participants, 34 years. Eighteen subjects aged 34 and older comprised the
old group, while the sixteen subjects aged 33 and younger comprised the young
group. Using the same statistical procedure, we found no significant difference
in score by age (t(34)=1.17, p=0.25). Similarly, when examining years of service
we divided the subjects into a more experienced group (defined as the median,
12 years, and greater) and a less experienced group (11 years or less); the more
experienced group comprised 18 subjects and the less experienced group 16
subjects. There was no significant difference in average scores between the two
years-of-service groups (t(34)=1.83, p=.08). Thus, overall, we suggest that
Convoy Task and CAPTTIM results cannot be explained by any potential difference
in demographic characteristics between the control and feedback groups.

B. RESEARCH QUESTION 1: REAL-TIME DATA CAPTURE 

In answer to the first research question, we found that the Python code in 
Appendix B was able to reliably capture subjects’ decision-making data (cognitive 
state and decision performance) in real time. 

The two factors are combined between each trial to result in an assigned
CAPTTIM categorization as of that trial. If a subject is observed to be exploiting the
environment (again, this is when the standard deviation of current decision times is
less than 1.5 times a subject’s individual baseline standard deviation) but regret is
high (i.e., not making optimal decisions), the subject’s CAPTTIM categorization is
red. Exploiting the decision-making environment with low regret earns a subject a
green categorization. The exploration cognitive states are similar: with high regret,
yellow CAPTTIM categorization; with low regret, orange CAPTTIM categorization.
This dynamic can be concisely depicted graphically. (See Figure 10.)

Exploitation & High Regret = RED, Exploration & High Regret = YELLOW,
Exploration & Low Regret = ORANGE, Exploitation & Low Regret = GREEN.

Figure 10. CAPTTIM Categorization States.

(Source: Kennedy et al., 2015)
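
The four states in Figure 10 amount to a two-key lookup, sketched below in Python. The label strings and the function name `capttim_category` are illustrative assumptions rather than identifiers from Appendix B.

```python
# Mapping of (cognitive state, regret level) to CAPTTIM color, per Figure 10.
CAPTTIM = {
    ("Exploit", "High"): "RED",
    ("Explore", "High"): "YELLOW",
    ("Explore", "Low"): "ORANGE",
    ("Exploit", "Low"): "GREEN",
}


def capttim_category(cog_state, regret):
    """Combine the two per-trial factors into a CAPTTIM color category."""
    return CAPTTIM[(cog_state, regret)]
```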


Table 2 presents a sample of the data captured for each subject and
demonstrates that the desired data can be captured in real time, on a
decision-by-decision basis, and used to categorize a subject into the appropriate
CAPTTIM category. Note that Table 2 has been edited for space and that the
trials included were selected to demonstrate the effective capture of all
CAPTTIM categories, not necessarily to give a complete record of the subject’s
consistent or overall performance. For example, it can be assumed that between
trials 51 and 61 the subject remained in the red CAPTTIM category, and from 61
to 79 the subject was in the yellow category continually. Nonetheless, the
overall capture of data and its conversion to a CAPTTIM category on a
decision-by-decision basis is demonstrated.

Table 2. Capture of CAPTTIM Real-time Data from Subject 211.

trial  routeSel  trialGain  trialLoss  Damage  latent  cogState  CAPTTIM
50     4         50         0          2450    0.645
51     4         50         0          2500    0.946   Exploit   RED
59     4         50         0          2650    0.526   Exploit   RED
60     4         50         0          2700    0.546   Exploit   RED
61     4         50         0          2750    12.316  Explore   YELLOW
78     1         100        0          3400    3.046   Explore   YELLOW
79     1         100        250        3250    0.827   Explore   YELLOW
80     3         50         0          3300    2.269   Exploit   RED
84     3         50         0          3400    1.633   Exploit   RED
85     3         50         50         3400    0.927   Exploit   RED
86     3         50         50         3400    1.23    Exploit   GREEN
91     4         50         0          3550    7.04    Exploit   GREEN
92     4         50         0          3600    0.706   Exploit   GREEN
93     4         50         0          3650    0.647   Exploit   RED
94     4         50         0          3700    0.606   Exploit   RED
95     4         50         0          3750    0.566   Exploit   RED


Displayed in the table from left to right are the data points captured on each
decision: the trial number (a count of the decisions which a subject has made);
the route selected (numbered 1 - 4 from left to right as viewed on the experiment
screen); the trial gain, a point value that the decision gained for the subject; the
trial loss, a point value the subject lost for the decision (these two values result
in a net gain, which can be positive or negative, for each decision); the running
Damage score (a result of all of the previous net gains, which began as a value of
2000); the latent time between each decision in seconds; the ‘explore’ or ‘exploit’
cognitive state; and the CAPTTIM color categorization.

We can also represent the percent of trials each subject in the control
group spent in each CAPTTIM categorization. (See Table 3). Also shown is the
overall percent of time all subjects were in each of the four CAPTTIM categories.


Table 3. Control Group Subjects’ CAPTTIM Breakdown for the Duration of the
Experiment.

CONTROL GROUP (percent of trials in each color)

SUBJECT       GREEN        YELLOW       ORANGE       RED
110           25           0            0            75
111           4.5          0            0            95.5
112           6.5          0            0            93.5
113           13           0            0            87
114           7.5          1.5          0            91
115           1            24.5         0.5          74
116           51           0            0            49
117           100          0            0            0
118           5.5          5            1.5          88
119           11.5         0            0            88.5
120           10           0            0            90
121           7            0            0            93
122           20           0            0            80
123           4            0            0            96
124           14           0            0            86
125           15.5         0            0            84.5
126           2            0            0            98
TOTAL (MEAN)  17.52941176  1.823529412  0.117647059  80.52941176
TOTAL (SD)    24.3140074   5.973747715  0.376223494  23.78449507


Category values are percentages of the total number (250) of decisions. Also
depicted (at the bottom of the table) are the mean and standard deviation of the
percentage of decisions the group spent in the corresponding CAPTTIM color
category.

Table 4 represents the percent of trials each subject in the feedback group
spent in each CAPTTIM categorization. Also shown is the overall percent of time
all subjects were in each of the four CAPTTIM categories.


Table 4. Feedback Group Subjects’ CAPTTIM Breakdown for the Duration of the
Experiment.

FEEDBACK GROUP (percent of trials in each color)

SUBJECT       GREEN        YELLOW       ORANGE       RED
210           7.5          0            0            92.5
211           52.5         9            0            38.5
212           90           4.5          1.5          4
213           66           0            0            34
214           8            4.5          0.5          87
215           14           0            0            86
216           44           5            0            51
217           10.5         0            0            89.5
218           6            3.5          1.5          89
219           0.5          0            0            99.5
220           8            5            0            87
221           17.5         5            0            77.5
222           20           5            0            75
223           11.5         3            2            83.5
224           49           4.5          0            46.5
225           0.5          12.5         0.5          86.5
226           8.5          3.5          1.5          86.5
TOTAL (MEAN)  24.35294118  3.823529412  0.441176471  71.38235294
TOTAL (SD)    26.10661118  3.381665531  0.704502327  26.55744329


Category values are percentages of the total number (250) of decisions. Also
depicted (at the bottom of the table) are the mean and standard deviation of the
percentage of decisions the group spent in the corresponding CAPTTIM color
category.
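
A per-subject breakdown of the kind shown in Tables 3 and 4 can be derived from the list of per-trial CAPTTIM categories, as in the hedged Python sketch below. The function name is hypothetical, and the sketch assumes the list passed in holds exactly the trials being evaluated, so the denominator is simply the length of that list.

```python
from collections import Counter

CATEGORIES = ("GREEN", "YELLOW", "ORANGE", "RED")


def category_percentages(per_trial_categories):
    """Percent of the supplied trials spent in each CAPTTIM category."""
    total = len(per_trial_categories)
    if total == 0:
        raise ValueError("no trials supplied")
    counts = Counter(per_trial_categories)
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}
```

Group-level rows such as TOTAL (MEAN) and TOTAL (SD) would then be the mean and standard deviation of these per-subject percentages.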

C. RESEARCH QUESTION 2: HYPOTHESES RELATIVE TO FEEDBACK
PROVIDED TO SUBJECTS AIMED TOWARD OPTIMIZING DECISION
MAKING

1. Data Preparation and Statistical Methods 

Because the data did not conform to a Normal distribution, we used
the nonparametric Wilcoxon Rank Sum test to test all hypotheses. A two-tailed
alpha level of .05 was employed for all statistical tests. We found two
outliers, one in each group: an unusually high score in the control group and an
unusually low score in the feedback group. The 7850-point total score of subject
117 in the control group is two standard deviations above the mean for the
control group. Similarly, subject 219’s score of -2300 is more than two standard
deviations below the mean for the feedback group. We will report results with
these data included and also briefly discuss results with those subjects
excluded from the calculations. Specific hypotheses relative to subject
performance were:

• H0₁: There is no difference in mean trial number of transition to the
‘green’ CAPTTIM classification between the feedback and no-feedback
groups.

• HA₁: Feedback group will demonstrate transition to the ‘green’
classification of CAPTTIM in fewer trials than subjects who receive
no feedback.

• H0₂: There is no difference in mean total score between feedback
and no-feedback groups.

• HA₂: Subjects who receive feedback during execution of the
Convoy Task will accumulate a higher total score as compared to a
no-feedback group.

• H0₃: The proportion of trials in the green classification will not be
significantly different between feedback and no-feedback groups.

• HA₃: Subjects who receive feedback during execution of the
Convoy Task will achieve a greater proportion of trials in the green
CAPTTIM classification than a no-feedback group.

• H0₄: The proportion of trials in the red classification will not be
significantly different between feedback and no-feedback groups.

• HA₄: Subjects who receive feedback during execution of the
Convoy Task will achieve a lesser proportion of trials in the red
classification of the CAPTTIM model.
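
For readers unfamiliar with the Wilcoxon Rank Sum test named in the data preparation paragraph, the following standard-library Python sketch computes its Z statistic and two-tailed p-value. It assumes the large-sample normal approximation with midranks and a tie correction and no continuity correction; the thesis does not state which variant or software was used, so this sketch is not expected to reproduce the reported values exactly. The function name `rank_sum_z` is hypothetical.

```python
from math import erfc, sqrt


def rank_sum_z(sample_a, sample_b):
    """Return (z, two_tailed_p) for the rank sum of sample_a."""
    pooled = sorted(
        (value, group)
        for group, sample in enumerate((sample_a, sample_b))
        for value in sample
    )
    n = len(pooled)
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0       # average of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties      # contributes 0 when there is no tie
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    expected = n_a * (n + 1) / 2.0
    variance = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:                     # every pooled value is identical
        return 0.0, 1.0
    z = (rank_sum_a - expected) / sqrt(variance)
    return z, erfc(abs(z) / sqrt(2))      # two-tailed normal p-value
```

The sign of Z depends on which group is passed first; the two-tailed p-value does not.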

2. Results 

Table 5 summarizes the overall results of each hypothesis detailed above 
and we discuss the detailed results of each conclusion below. 


Table 5. Results of Hypotheses Including Test Statistics and P-values for
Each Hypothesis.

Hypothesis description        CONTROL            FEEDBACK           STATS      Conclusion
                              mean (SD)          mean (SD)
H1: Feedback group will       115.3 (52.7)       136.6 (40.7)       N/A*       N/A*
    transition* earlier.
H2: Average score of          2782.35 (2556.91)  3617.65 (2457.07)  Z=1.206    Retain H0₂
    feedback higher than                                            p=0.228
    control.
H3: Proportion in Green of    .18 (0.24)         .24 (0.26)         Z=0.913    Retain H0₃
    feedback group higher                                           p=0.361
    than control group.
H4: Proportion in Red of      .80 (0.81)         .71 (0.27)         Z=1.433    Retain H0₄
    feedback group lower                                            p=0.153
    than control group.

* Only 3/17 (control) and 5/17 (feedback) transition to green category.






a. Hypothesis 1 

To address hypothesis 1 we defined a transition to the green CAPTTIM
category as 20 or more consecutive trials in the green category. Due to the small
number of subjects who effectively transitioned to the green category based
on our definition (3 of 17 (~18%) in the control group and 5 of 17 (~29%) in the
feedback group), we did not statistically test this hypothesis. Excluding the
single outlier in each group, these numbers become even more difficult to
analyze effectively, with only 2 of 16 subjects from the control group
transitioning and 5 of 16 in the feedback group.
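
The transition definition used for hypothesis 1 can be expressed as a short scan over a subject's per-trial categories. In this Python sketch the function name, the default first evaluated trial of 51, and the choice to report the trial at which the qualifying run begins (rather than the trial at which it reaches 20) are assumptions made for illustration.

```python
def first_green_transition(categories, first_trial=51, run_length=20):
    """Trial number where the first run of `run_length` GREEN trials begins.

    categories holds one CAPTTIM label per evaluated trial, in order.
    Returns None if the subject never sustains such a run.
    """
    run = 0
    for offset, category in enumerate(categories):
        run = run + 1 if category == "GREEN" else 0
        if run == run_length:
            return first_trial + offset - run_length + 1
    return None
```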

b. Hypothesis 2 

The results regarding total score were not significant (Z=1.206,
p=0.228). As noted above, both groups contained a subject with an extreme
score: unusually high in the control group and unusually low in the feedback
group. The 7850-point total score of subject 117 in the control group is two
standard deviations above the mean for the control group. Similarly, subject
219’s score of -2300 is more than two standard deviations below the mean for the
feedback group. Excluding these extreme values, we obtain Z=1.941,
p=0.052. This value is still not statistically significant, but nearly so.
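
The distance of each extreme score from its group mean can be checked with the group means and standard deviations reported in Table 5. The helper name below is hypothetical, and the calculation is a simple worked illustration rather than part of the original analysis.

```python
def sd_from_mean(score, group_mean, group_sd):
    """Number of standard deviations by which a score departs from its group mean."""
    return (score - group_mean) / group_sd


# Subject 117 against the control group statistics from Table 5.
control_outlier = sd_from_mean(7850, 2782.35, 2556.91)
# Subject 219 against the feedback group statistics from Table 5.
feedback_outlier = sd_from_mean(-2300, 3617.65, 2457.07)
```

With these inputs, subject 117 lies roughly 1.98 standard deviations above the control mean and subject 219 roughly 2.41 standard deviations below the feedback mean.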

c. Hypotheses 3 and 4 

The third and fourth hypotheses are related to each other, as they pertain
to the proportion of trials spent in the green and red CAPTTIM categories,
respectively. Again, while not statistically significant, both results trend in the
hypothesized direction: hypothesis 3 (Z=0.913, p=0.361), hypothesis 4 (Z=1.430,
p=0.153). Subjects who received feedback during execution of the Convoy Task
spent a lower proportion of decisions in the red category and a greater
proportion of decisions in the green category than the control-group subjects.
As outliers, subjects 117 (control group) and 219 (feedback group) had similar
impacts on the mean proportion of decisions each group spent in the red or green
CAPTTIM categories. Subject 117 was uncharacteristically in the green for 100%
of the evaluated decisions. Conversely, subject 219 was in the red for 99.5% of
the evaluated decisions. If we exclude the two outliers, we still fail to reject
the null hypothesis for Hypothesis 3 regarding the proportion of trials in the
green category, but by a slimmer margin (Z=1.621, p=0.105). However, we do
reject the null for Hypothesis 4 regarding the proportion of trials in the red
category (Z=2.186, p=0.029).

D. EXPLORATORY ANALYSIS 

During review of the post-task surveys (Appendix D), subjects in both the
control and feedback groups (four of 17 subjects and six of 17 subjects,
respectively) correctly identified the most dangerous route as route two. We
sought to determine whether correct identification of the most dangerous route
was associated with optimal decision-making as defined by our hypotheses: a
higher total damage score, a greater proportion of decisions in the green
CAPTTIM category, or a lesser proportion of decisions in the red CAPTTIM
category. Using a two-sample t-test to compare the means of the total damage
scores and CAPTTIM proportions of the two groups (i.e., those who, post-task,
correctly identified the most dangerous route and those who did not), we find no
significant difference in mean score between those who identified the most
dangerous route (M = 3330, SD = 2261.78) and those who did not (M = 3145.83,
SD = 2644.52): t(34) = 0.206, p = 0.581. There also is no significant difference
in the proportion of decisions that “dangerous route identifiers” (M = 32.35,
SD = 29.08) and “non-identifiers” (M = 16.18, SD = 22.17) spent in the green
CAPTTIM category (t(34) = -0.708, p = 0.242) or the red category (t(34) = 0.987,
p = 0.826). Thus, while it is interesting that some subjects correctly identified
the most dangerous route, this awareness does not necessarily contribute to
optimal decision-making; a decision-maker who can identify the factors needed to
avoid continuously making the worst decision does not necessarily apply an
optimal strategy.
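
The comparison of group means described above can be sketched in a few lines of
Python. This is a minimal illustration only: it assumes Welch's unequal-variance
form of the two-sample t-test (the variant actually used in the analysis is not
stated), the function name welch_t is hypothetical, and the input lists in the
usage note are made-up placeholders, not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    Assumption: Welch's form is used here for illustration; a pooled-variance
    test would give a different degrees-of-freedom value.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df
```

In use, the two arguments would be the per-subject scores (or CAPTTIM
proportions) of the “identifier” and “non-identifier” groups, for example
welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]) with placeholder values. The
resulting t and df would then be referred to a t distribution to obtain p.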
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IV. DISCUSSION 


Decision-making, both understanding it and improving its efficacy,
continues to be a focus of effort throughout the DOD (Odierno & McHugh, 2015;
U.S. Marine Corps, 2012). This thesis sought to further the efforts of previous
work (Nesbitt et al., 2015; Kennedy et al., 2015; Critz, 2015) in capturing
decision-making performance and increasing decision-making expertise. The
primary goals of this thesis were to: (1) adapt a test of reinforcement learning
(Convoy Task) and a validated model of decision-making classification
(CAPTTIM) in order to categorize decision performance and cognitive state in
real time and (2) given that effective real-time categorization, provide feedback to
subjects whose performance was suboptimal in an effort to improve decision
performance. The first goal was successfully accomplished. Results pertaining to
the second goal showed trends toward effective influence of decision makers
toward optimal decisions. Fine-tuning the model may allow significant results to
be realized even with a small sample; alternatively, given the trends observed,
increasing the sample size may provide the statistical power needed to detect
an effect. This final chapter discusses implications of the results, explores
some limiting factors that were not examined statistically as part of the
research, and addresses areas of future work that should be explored.

A. IMPLICATIONS 

The Convoy Task that was modified from Critz (2015) and Nesbitt et al.
(2015) maintains a structure that requires subjects to be adaptive and mentally
agile, and to demonstrate reasoned decision-making skills. As recommended by
Critz (2015), this thesis successfully modified CAPTTIM to act as a tutor to guide
subjects toward optimal decision-making. Based on results from this thesis, the
Convoy Task and CAPTTIM offer an enhanced capability to aid DOD research
toward developing more effective decision makers.

The modification and employment of the Convoy Task and the effective
real-time CAPTTIM categorization may open further study into understanding
and instructing optimal decision-making. The ability to categorize a decision
maker’s cognitive state together with their decision performance in real time
could allow training systems to be designed to tailor training to the individual
decision maker. CAPTTIM could be used to interrupt training that is trending
toward suboptimal decision-making performance when a subject’s cognitive state
is misaligned with their decision performance. Further, if future experiments
demonstrate the ability to significantly change subject behavior during
execution of a task, training exercises (whether task trainers, learning uptake
exercises, etc.) may be designed with a built-in mechanism to guide suboptimal
performers by way of in-process feedback that takes into account the performer’s
cognitive state, similar to the tailored guidance messages employed in this
thesis.

Although this experiment consisted of a relatively simple task, the concept
of categorizing both cognitive state and decision performance in real time can be
expanded to existing training simulations that require multiple, complex, chained
decisions, where each decision can be influenced toward the optimal decision in
order to maximize training value and improve the cognitive skills and effective
decision making of small unit leaders. This idea is aligned with previous research
in training effectiveness that suggests that “training interventions are required to
improve teamwork skills, such as decision making, communications, shared
situation awareness, leadership, and co-ordination, to ensure efficient team
functioning. Such training results in more effective and efficient decision making,
accelerated proficiency and the development of expertise in individuals and
teams” (Crichton & Flin, 2001, p. 259). This intervention is precisely the type of
response that was attempted here; when a subject has made one, or a series, of
incorrect decisions, there is now a mechanism that can alert the subject to the
suboptimal performance. More than just pointing out a wrong answer, this
research categorized a subject’s ability to make correct decisions and
demonstrated a system’s or trainer’s ability to act upon that categorization to
help the subject make better decisions. Furthermore, the ability to guide a
subject to understanding the problem at hand, and how to properly act within the
decision environment (as opposed to merely pointing out the correct answer to a
single task or situation), is crucial for learning and developing effective
decision-making expertise (Archer, 2010).

The ability to incorporate objective decision-making measures into any
existing simulation, and to demonstrate the optimal decision path (or to
correct deviation from it), may also reduce the time required in the
trial-and-error phase of reinforcement learning, resulting in savings of the time
and money required to train military decision makers. Our results suggest that
it is, in fact, possible to understand the decision-maker’s cognitive state in
real time with simple behavioral measures. With further refinement of the
tailored feedback, this understanding could allow future leaders, instructors,
and trainers to leverage this approach to improve the processes and methods
used to understand and instruct effective decision-making.

B. LIMITATIONS 

Observations during the collection and analysis of data for this thesis
revealed potential limitations to the method and results presented above. Given
the data-driven nature of the experiment and the neutrality of the software
program capturing data and classifying subjects in accordance with the
CAPTTIM model, it is unlikely that these issues had an impact on results, but
they should be discussed in order to improve future efforts using the same or
similar methodology.

1. Feedback to Subjects 

Post-task surveys, and comments volunteered by some subjects while
being debriefed about the study, suggested that the messages offered to the
feedback group might have caused confusion (see Appendix D). Taken as a
whole, these comments suggest that the timing and frequency of the feedback
messages could be refined. For example, one subject reported that the
comments provided in the pop-up windows were not timely enough to capture
their immediate actions and their perceived performance at that moment in the
experiment. The feedback windows, having been programmed to collect
cognitive state, decision performance, and current CAPTTIM categorization
every 10 trials and display it to the subject, may not account for subject strategy
changes within this ten-decision window. This subject specifically noted that a
feedback window directed them to “...attend to friendly damage and try other
routes” (this indicates red CAPTTIM categorization). However, this subject, by
their own recollection, had already made an adjustment to decision strategy and
was beginning to make progress away from the red CAPTTIM category, but was
then confused by the advice to try other routes. This same subject further stated
that they suspected the feedback windows might have been designed as an
experimental distraction meant to be overcome by individual assessment of
perceived performance, despite instructions that such feedback would be offered
to guide subjects to optimal decision making. A closer inspection of the data
showed that this subject’s final Accumulated Damage score was an outlier, more
than 2 SD below the mean score of the feedback group.

Conversely, another subject commented that the thesis might choose to
“study how annoying those pop-up windows are.” This subject had quickly
recognized the long-term, overall, safest route and had adopted the optimal
decision-making strategy to maximize Accumulated Damage score as per the
instructions, and thus did not need the guidance to “...stay with your strategy”
every tenth trial. This subject received the same message every tenth trial
despite continued green CAPTTIM performance. In order to control possible
confounding factors for this research, conditions for eliminating the messages
if a subject remained in the green CAPTTIM category for a certain number of
trials were not included. Dynamic intervention intervals should be added to the
system for subsequent research in order to allow optimally aligned
decision-making to continue uninterrupted.

Options for displaying the feedback to subjects every 15th trial, in line with
the original change point value used in Critz (2015), or displaying feedback
messages only when the CAPTTIM categorization changed, were explored during
the early phases of designing the present study. Ultimately, these approaches
were dismissed in favor of a design that presented notification every tenth trial in
order to maintain a uniform number of messages to subjects. Experimentally, this
design provided a standard number of opportunities to influence performance
and also eliminated a potentially confounding variable (variability in the frequency
and timing of feedback messages) that would likely have threatened the internal
validity of this study.
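
The three display designs weighed above (every tenth trial, every 15th trial,
and only on a change of CAPTTIM category) can be contrasted in a short sketch.
This is an illustration under stated assumptions, not the code used in the
experiment: the function name should_show_feedback, its parameters, and the
policy labels "fixed" and "on_change" are all hypothetical.

```python
def should_show_feedback(trial, category, previous_category,
                         policy="fixed", interval=10):
    """Decide whether a feedback window is displayed on this trial.

    policy "fixed":     display on every interval-th trial, regardless of
                        category (interval=10 mirrors the design adopted in
                        this study; interval=15 mirrors the Critz (2015)
                        change point value that was considered).
    policy "on_change": display only when the CAPTTIM category differs from
                        the previously recorded category.
    All names here are illustrative assumptions.
    """
    if policy == "fixed":
        return trial % interval == 0
    if policy == "on_change":
        return category != previous_category
    raise ValueError("unknown policy: %s" % policy)
```

Under the "fixed" policy every subject receives messages on the same trials,
which is the uniformity the adopted design relied on; under "on_change" the
number and timing of messages depend on each subject's behavior, which is the
variability the design sought to avoid.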

2. Identification of Bad Routes Vice Optimal Decisions 

Some subjects reported that they recognized the long-term danger of
routes one and two, but also that these routes were safe for a set number of
decisions before the imposition of high Friendly Damage (see Appendix A).
These subjects stated that they attempted to maximize their score by selecting a
known-unsafe route right up to the decision that would result in losing points,
but they never figured out the pattern precisely enough to achieve a maximum
score by this method; they attempted to “game the game,” but were rarely
successful. Subjects who continue to make suboptimal decisions, despite a score
and messages indicating a flawed strategy, may be so focused on trying to game
the system that they do not recognize their poor performance. Attentional
tunneling, attending to a task or goal for longer than is optimal (Wickens &
Alexander, 2009), further supports the need to notify subjects of poor
performance. Although the results of the exploratory analysis suggested that
subjects who identified the most dangerous route performed no differently than
those who failed to do so, the use of such a strategy results in either
inefficient allocation of cognitive resources during task completion or a
failure to recognize a more optimal strategy than their current decision-making
approach. This situation can, however, be accounted for in the model. Critz
(2015) modified the CAPTTIM model to automatically place the subject into the
red category if they chose route two after trial 100. This methodology was not
included in the real-time categorization used here, as it was important to gather
the subject’s data and attempt to influence decisions. Given this goal,
automatically placing a subject in the red category would nullify some
opportunities to evaluate cognitive state and regret in order to influence future
decisions. Empirically based refinements to increase the sensitivity of the
real-time data capture and analyses of CAPTTIM will enable fine-tuning of
feedback messages and present the opportunity to address and avoid attentional
tunneling. Advances in this area will increase the likelihood of keeping subjects
from pursuing an ineffective strategy when it is evident that cognitive state is
not aligned with performance.
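
The Critz (2015) override described above, which was deliberately left out of
the real-time categorization in this thesis, amounts to a single rule. The
sketch below is an assumption-laden illustration: the function name
apply_route_two_override and the lowercase category strings are hypothetical,
and “after trial 100” is read as a trial number greater than 100.

```python
ROUTE_TWO = 2               # the most dangerous route in the Convoy Task
OVERRIDE_AFTER_TRIAL = 100  # threshold described for Critz (2015)


def apply_route_two_override(category, route, trial):
    """Force the red category when route two is chosen after trial 100.

    category is the CAPTTIM category computed from cognitive state and
    decision performance; it is returned unchanged when the rule does not
    apply. Names and string values are illustrative assumptions.
    """
    if route == ROUTE_TWO and trial > OVERRIDE_AFTER_TRIAL:
        return "red"
    return category
```

Because the rule replaces the computed category outright, applying it in real
time would discard the cognitive state and regret information for those trials,
which is the reason given above for omitting it.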

C. FUTURE WORK 

Based on the successful demonstration of real-time CAPTTIM
categorization of decision making, and the trend towards influencing decision
makers toward optimal decision making, future work should focus on: fine-tuning
the Convoy Task application (and incorporation of the CAPTTIM model therein)
to ensure precise capture of CAPTTIM category; refining the feedback messages
to more effectively influence decision performance; and expanding the
application of the Convoy Task and CAPTTIM to a population outside of NPS or
to a larger sample from the current population. Other areas of future work
include using eye gaze patterns and individual difference factors such as head
injury status to gain greater insight into why some subjects do not reach
optimal decision-making. These areas are discussed below.

1. Refinement of CAPTTIM Coding and Feedback Messages 

As mentioned, a limitation of the model as presently implemented is the
rigidity of the feedback to subjects, and the confusion or frustration that this may
cause. Ultimately, the goal of ongoing research is to develop a system
that is sensitive enough to detect when a subject is significantly off the optimal
decision-making path and to provide appropriate feedback to get them on the path
at the “right” time. This goal can be accomplished through additional or more
refined messages to each subject. For this research, messages were developed
that correspond to each of the four CAPTTIM categories, each message
attempting to influence subjects’ decision-making toward the green category. A
future study should investigate the use of additional messages if a subject
remains in the red CAPTTIM category after being alerted once, twice, or three
times that they should change strategy. Similarly, the efficacy of displaying
messages more frequently if a subject is in a suboptimal state (red, yellow, or
orange), and not at all if the subject has achieved the green category for 10 or
more consecutive trials, should be investigated.
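
One possible shape for these refinements is sketched below. Everything in it is
a hypothetical illustration rather than a tested design: only the first red
message is taken from the task code in Appendix B, while the two escalation
messages, the five-trial interval for suboptimal states, and the function names
red_message and feedback_interval are assumptions introduced for the example.

```python
# First entry is the red-category message used in this study; the remaining
# entries are assumed wording for repeated alerts (not from the thesis).
RED_MESSAGES = [
    "Score could be better; attend to friendly damage and try other routes",
    "Score is still suffering; your current routes are costing you",   # assumed
    "Change strategy now; one route is safer over the long term",      # assumed
]


def red_message(prior_red_alerts):
    """Pick a red-category message that escalates with repeated alerts."""
    index = min(prior_red_alerts, len(RED_MESSAGES) - 1)
    return RED_MESSAGES[index]


def feedback_interval(category, consecutive_green_trials):
    """Return trials between messages, or None to suppress messages.

    Green for 10 or more consecutive trials: no messages at all.
    Green for fewer than 10 trials: every 10 trials, as in this study.
    Red, yellow, or orange: every 5 trials (an assumed shorter interval).
    """
    if category == "green":
        return None if consecutive_green_trials >= 10 else 10
    return 5
```

A subject who stays in the red category would thus see progressively different
wording at a shorter interval, while a subject who has settled into the green
category would be left to continue uninterrupted.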

2. Expand Population of Interest 

Although the sample for this study (drawn from current active-duty NPS
officer-students) was uniquely suited to examine decision-making in a
military-themed task, an investigation including a broader demographic should be
conducted. A larger, less homogeneous military population could include
decision-makers of various ranks and experience, or from units and institutions
not specifically focused on graduate-level education. A typical standing military
unit comprises members of varied ranks and education levels, with different
decision-making requirements and different approaches to decision-making. As
evidenced by age and military experience data from the sample in this study, the
population at NPS has a considerable amount of decision-making experience,
and brings the biases associated with experience to the task. It is possible that
this experience caused decisions that were not anticipated in the coding for
feedback messages. Thus, once the code is refined to account for differences in
experience, the general approach used here could serve as the framework to
examine the decision strategy of entry-level military members and compare those
strategies to a group that has been educated and evaluated (possibly through
real-world experience) in crucial decision-making environments. Junior members,
if they were on the optimal decision-making path (i.e., in the green CAPTTIM
category), would be left to continue the immediate decision-making task. Senior,
more experienced subjects could be evaluated against a tighter standard of
effectiveness, i.e., achieving the green category more quickly or requiring less
focused feedback to adjust errors in strategy.

3. Expand to a More Complex Task 

Subjects in our Convoy Task had only one decision to make repeatedly:
which one of four routes to send their convoy along. The CAPTTIM model for
categorizing decision-making can be applied to each decision in a changing
environment. Strategy, first-person-shooter, flight simulation, and even board
games require a series of decisions that are unique to the situation at hand, but
each may be categorized by the CAPTTIM method as the best possible decision at
that time, or some suboptimal fraction of the best possible decision. Applying the
evaluation and feedback approach demonstrated in this thesis to a more complex
task may reveal facets of decision-making (and its effectiveness) that are not
realized when a subject is faced with the same decision over and over again.
While the Convoy Task and the underlying IGT have been shown repeatedly
to effectively capture decision-making performance, a deviation from this singly
focused task would be illuminating.

4. Use of Eye Gaze Data 

As mentioned in the Procedures Section, eye-tracking cameras were used
to capture the gaze point on the task screen of each subject. Analysis of these
data was beyond the scope of this thesis. However, because the data were
collected and preserved, they could be examined retrospectively to determine
whether there is a difference in gaze points between high- and low-scoring
subjects or between feedback and control subjects. It may be informative to know
whether the subjects in the feedback group spent any significant time reading the
messages that were displayed to them regarding adjusting strategy, or whether
they allocated more attention to the most relevant piece of information, Damage
to Friendly Forces. It also would be informative to see whether those subjects
who attempted to “game the game” were less likely to attend to the Accumulated
Damage score.
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5. The Role of Head Injury Incidence in Explaining Some of the 
Large Variability in Convoy Task Scores 

Similarly, self-reported head injury incidence was collected, but the
analysis of these data was outside the scope of this thesis. Head injury incidence
can be used as an indicator of TBI. Future work may examine these data to
determine whether head injury incidence explains some of the large
variability in Convoy Task performance, or whether those with a history of frequent
and/or severe head injuries would benefit differently from feedback than those
without such a history. We balanced those with varying degrees of head injury
incidence between the two groups, but future studies may block all subjects with
indicators of TBI into one group to see whether the feedback has any effect given
the history of brain injury. Previous results suggest that those with self-reported
TBI show unusual decision performance patterns (Kennedy, Adamson, Huston &
Nesbitt, 2015).

D. CONCLUSION 

Decision-making is an everyday task that takes on greater significance for
military professionals, first responders, and others whose decisions can have
outsized impacts. Future U.S. military capability will be evaluated on military
members’ ability to make effective, agile, adaptive, and innovative decisions
(Odierno & McHugh, 2015). This emphasis on the development of personnel,
rather than on the acquisition of material solutions, lends gravity to the
research conducted here. More than just a necessity driven by budget cuts,
advances in technology and the application of innovative methods of simulated
and virtual-environment training are an opportunity to improve the performance
of the modern military. The tasks and situations faced by every military member
call for an advanced understanding of the individual’s decision-making
capability, and for development of that capability in a manner never expected of
previous generations.

We have shown that it is possible to capture the cognitive state and
decision performance of subjects in real time. Myriad factors drive an
individual’s decision-making strategy, and these require further exploration.
However, by continuing to explore this process, this research moves closer to
effective development of continuous, objective measures and analysis capability
for long-term tracking of decision-making skills. Understanding and influencing
military decision-making pairs naturally with advances in virtual environments
and simulated training. Investigating, developing, and applying innovative
approaches to training and education, and incorporating the evaluation and
intervention strategy applied here, increase the potential to effectively train
optimal decision-making in less time.
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APPENDIX B. CONVOY TASK CODE 


A. CONTROL GROUP CODE 

# Optimal Decision Making Demonstration
# Military Wargaming, Convoy Route Selection
#
# Multi Arm Bandit (n=4)
# In support of TRAC Project Code 638
# author: Peter Nesbitt and Cardy Moten III, TRAC-MTRY
# peter.nesbitt@us.army.mil or cardy.moten3.mil@mail.mil
# addition/modification of COGNITIVE STATE and REGRET
# Travis Carlson, MOVES, NPS
# - #
# IMPORTS #

from random import *
import random
import numpy as np
import time
from time import localtime, strftime
from math import *
import Tkinter
from Tkinter import *
import tkMessageBox
import tkFont
import Tkinter as tk
from PIL import Image, ImageTk
import winsound
import csv
import array
from datetime import datetime
import calendar

# - #
# FUNCTIONS AND CLASSES #
# - #

class Player:
    def __init__(self, onhand, plays):
        self.oh = onhand
        self.p = plays

class Bandit:
    def __init__(self, l_gain, l_loss, l_payoff):
        self.gain = l_gain  # dictionary of initial bandit parameters
        self.loss = l_loss
        self.po = l_payoff  # total earned for that machine

class Click:
    def __init__(self, xcoord, ycoord):
        self.x = xcoord  # x for every click on canvas
        self.y = ycoord  # y for every click on canvas

class Control:
    def __init__(self, l_routeUse):
        self.route = l_routeUse
        self.playlimit = 250

class DecideTime:
    def __init__(self):
        self.start = time.time()  # time since last decision

class Application(Frame):
    def __init__(self, master=None):
        Frame.__init__(self, master)
        self.pack()
        self.buildFrame()
        self.remaining = 0

    def restart(self, remaining = None):
        if remaining is not None:
            self.remaining = remaining
        if self.remaining <= 0:
            self.hurry.configure(text="time's up!")
        else:
            self.hurry.configure(text="%d" % self.remaining)
            self.remaining = self.remaining - 1
            self.after(1000, self.countdown)


    def buildFrame(self, remaining = None):
        # self.buildFrame2()
        self.customFont1 = tkFont.Font(family="Arial Bold", size=30)
        self.customFont2 = tkFont.Font(family="Arial Bold", size=20)
        self.customFont3 = tkFont.Font(family="Arial Bold", size=10)
        self.customFont4 = tkFont.Font(family="Arial Bold", size=20)

        topLabel = Label(self,text="Select route for next convoy.",
            font=self.customFont2).grid(row=0,column=2, pady=25)
        #topLabel = Label(self,text="Select route for next convoy.",
        #    font=self.customFont2).grid(row=0,column=1, columnspan=3, pady=25)
        #Label(self,text= "").grid(row=5,column=2)
        Label(self,text= "Damage to Enemy Forces", fg="black",
            font=self.customFont2).grid(row=30,column=1,pady=25)
        #Label(self,text=0,
        #    font=self.customFont1).grid(row=7,column=2)
        l1 = Label(self,textvariable=v_gain, fg="black",
            font=self.customFont1).grid(row=25,column=1, pady = 0)
        Label(self,text= "Damage to Friendly Forces",fg="red",
            font=self.customFont2).grid(row=30,column=4)
        l2 = Label(self,textvariable=v_loss, fg="red",
            font=self.customFont1).grid(row=25,column=4, pady = 0)

        b_1 = Button(self,command=bdt1,bg='white')
        b_2 = Button(self,command=bdt2,bg='white')
        b_3 = Button(self,command=bdt3,bg='white')
        b_4 = Button(self,command=bdt4,bg='white')

        self.photo1=ImageTk.PhotoImage(file="Picture2.png")
        self.photo2=ImageTk.PhotoImage(file="Picture2.png")
        self.photo3=ImageTk.PhotoImage(file="Picture2.png")
        self.photo4=ImageTk.PhotoImage(file="Picture2.png")

        b_1.config(image=self.photo1, width="400",height="400")
        b_2.config(image=self.photo2, width="400",height="400")
        b_3.config(image=self.photo3, width="400",height="400")
        b_4.config(image=self.photo4, width="400",height="400")

        b_1.grid(row=6, column=1)
        b_2.grid(row=6, column=2)
        b_3.grid(row=6, column=3)
        b_4.grid(row=6, column=4)

        l30 = Label(self, text="", fg="gray50",
            font=self.customFont3,anchor=E).grid(row=3,column=1, pady=20)
        l35 = Label(self, text="", fg="gray50",
            font=self.customFont3,anchor=E).grid(row=7,column=2, pady=20)
        l36 = Label(self, text="", fg="gray50",
            font=self.customFont3,anchor=E).grid(row=8,column=2, pady=20)
        l3 = Label(self,text="Accumulated Damage :",fg="gray50",
            font=self.customFont2,anchor=E).grid(row=1,column=2, pady=20)
        l4 = Label(self,textvariable=v_onhand, fg='gray50',
            font=self.customFont1).grid(row=1,column=3, pady=25)
        #l5 = Label(self,text="(Positive number is good)",fg="gray50",
        #    font=self.customFont2,anchor=E).grid(row=1,column=4)

def callback(e):
    click.x = e.x
    click.y = e.y
    # print "clicked at", e.x, e.y

def WriteToFile(listArray, subName):
    with open(subName, 'wb') as csvfile:
        w = csv.writer(csvfile)
        w.writerow(['trial'] + ['routeSel'] + ['trialGain'] + ['trialLoss'] +['Damage'] +
            ['x'] + ['y'] + ['latent'] + ['unixTime']+ ['machTime'] + ['cogState'] +['CAPTTIM'])
        for e in listArray:  # for every trial data array,
            w.writerow(e)    # write it to file

def readFromFile():
    with open('TDC.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            latentlistR = row[7]
            choiceRead = row[1]
            print "route: ", choiceRead, "latency: ", latentlistR
            #print row[3]

# - #
# GLOBAL CONSTANTS #
# - #

runData = []  # storage tuple to temp store data
latentList = []  # store latency times
latentListR = []
latentList50 = []  # have to use something different for first 50 because we need the whole latentList later
avglatentList = []  # store EWMA latency times
avgLatencyTime = 0
latencyLambda = 0.9  # EWMA lambda parameter
sdLatencyTime = 0.0  # standard deviation of latency time
cogState = 'test'  # Cognitive State string
baseLineLatency = 0
# List of intervention messages
messageList = ["Score could be better; attend to friendly damage",
               "Score could be better; attend to friendly damage and try other routes",
               "Score is looking good; go ahead and make decisions quickly",
               "Score is looking good; stay with your strategy"]
CAPTTIM = ' '
gainList = []  # Capture absolute regret
medGain = []  # Capture median regret values

subName = strftime("%Y %b %d %a %H %M Mil MultiArmBandit.csv",
                   localtime())  # time as file name

root = Tkinter.Tk()

# Player parameters
x = Player(2000,0)  # instantiate player object onhand,plays

irouteUse= {}
irouteUse[1]= 0
irouteUse[2]= 0
irouteUse[3]= 0
irouteUse[4]= 0

game = Control(irouteUse)
time_to_decide = DecideTime()
click = Click(0,0)

v_onhand = DoubleVar()  # instantiate running total onhand
v_onhand.set(x.oh)  # running total from all machines
l_color = 'black'  # variable name garbled in source; l_color is an assumption
v_gain = DoubleVar()
v_gain.set(0)  # running total from all machines
v_loss = DoubleVar()
v_loss.set(0)  # running total from all machines
v_plays = IntVar()
#v_plays.set(game.playlimit-x.p)
v_plays.set(x.p)

v_bdt1 = DoubleVar()  # last payoff value for machine 1
v_bdt2 = DoubleVar()
v_bdt3 = DoubleVar()
v_bdt4 = DoubleVar()

v_gain1 = DoubleVar()
v_gain2 = DoubleVar()
v_gain3 = DoubleVar()
v_gain4 = DoubleVar()
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# Bandit parameters held in a dictionary 
igain= {}
igain[1]= 100 #bandit 1: n1,p1,n2,p2
igain[2]= 100
igain[3]= 50
igain[4]= 50

iloss= {}
iloss[1]= [-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,

0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0, 

-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0, 

-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0, 

-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350, 

-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0, 

-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150, 

0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0


-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150, 

0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0, 

-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0, 

-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0, 

-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350, 
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-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0, 


-350,-250,0,-200,0,-300,0,-150, 


0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0


-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250] 


iloss[2]= [-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,


-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0] 


iloss[3]= 

50,0,-50,0, 


[-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,- 


-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,- 


50,0,0, 


-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,- 


50, 
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50,0, 


0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,- 

0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0, 
-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0, 
-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50, 
-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0, 
-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0, 
0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0, 
-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0, 
-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50, 
-50,-50,0,-50,0,-50,0,-50,0] 


iloss[4]= [-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,


-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 
-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0] 
print 'Deck A is',len(iloss[1]),'cards long.'
print 'Deck B is',len(iloss[2]),'cards long.'
print 'Deck C is',len(iloss[3]),'cards long.'
print 'Deck D is',len(iloss[4]),'cards long.'

ipayoff= {} 
ipayoff[1]= 0 
ipayoff[2]= 0 
ipayoff[3]= 0 
ipayoff[4]= 0 

b=Bandit(igain,iloss,ipayoff) 

v_bdt1.set(b.po[1]) 

v_bdt2.set(b.po[2]) 

v_bdt3.set(b.po[3]) 

v_bdt4.set(b.po[4]) 

v_gain1.set(0) 

v_gain2.set(0) 

v_gain3.set(0) 

v_gain4.set(0) 

def refresh():
    app.mainloop()

def bdt1():
    machine= 1
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt2():
    machine= 2
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt3():
    machine= 3
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt4():
    machine= 4
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

#Display a modal pop-up info box with the supplied message string
def displayDialog(message):
    tkMessageBox.showinfo("Guidance", message)

def getGain(gain,loss,mach):

    CAPTTIM = ""
    cogState = ""
    gainP = gain
    lossP = -1*loss.pop()
    gain = gainP - lossP

    latent = time.time() - time_to_decide.start
    time_to_decide.start = time.time()
    game.route[mach] += 1

    b.po[mach] = b.po[mach] + gainP + lossP # update earning by machine
    x.oh = x.oh + gain # update earnings total, subtracting any cost to play
    if x.oh < 0:
        □color = 'red'
    else:
        □color = 'black'
    v_onhand.set(x.oh)

    x.p = x.p + 1 # update times game played

    dt = datetime.now()
    machTime= dt.time()
    unixTime = calendar.timegm(dt.utctimetuple())

    if x.p <= 50:
        selData=[x.p, mach, gainP,lossP,x.oh, click.x, click.y,
                 latent,unixTime,machTime, CAPTTIM]
        runData.append(selData) # store data

        # Cognitive State #

        if x.p<=2:
            avgLatencyTime = latent
            avglatentList.append(latent) #Store Average Latency Time
            latentList.append(latent)
            latentList50.append(latent)
        else:
            if gain >= 0: ##only append the latency time to the list if the choice is not 'bad'
                latentList.append(latent)
                latentList50.append(latent)
                #Compute EWMA Latency from Nesbitt Understanding Optimal Decision Making
                avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
                #Store Average Latency Time of these GOOD decisions
                avglatentList.append(avgLatencyTime)
            else:
                avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
                #Still computing the average latency time, just not appending it to the list when subject takes a hit during first 50 trials
                print "bad choice, latency not added to avgLatentList"
        baseLineLatency = np.mean(latentList50)

    else:
        latentList.append(latent) #still have to capture all the raw times? We should be using EWMA for Exp/Exp
        baseLineLatency = np.mean(latentList50) ##there's nothing added to latentList50 after trial 50, so baseLineLatency stays the same
        STDofBaseLineLatency = np.std(latentList50) #Compute the standard deviation of the latency time
        avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
        avglatentList.append(avgLatencyTime)
        print "SD of baseline", STDofBaseLineLatency

        #get the standard deviation of the last 10 EWMA latency values from the overall list
        Last10AvgLatencies = avglatentList[len(avglatentList)-10:len(avglatentList)]
        STDLast10 = np.std(Last10AvgLatencies)
        print "STD of Last 10 trials: ", STDLast10

        if STDLast10 <= 1.5*STDofBaseLineLatency:
            cogState = 'Exploit'
        elif STDLast10 > 1.5*STDofBaseLineLatency:
            cogState = 'Explore'
        print "CogState is: %s" %cogState

    # Regret #

    gainList.append(gain)
    regret = -gain

    for check in range(50,game.playlimit,10):
        checkLast10 = gainList[len(gainList)-10:len(gainList)] #take the most recent 10 gain values from the overall list
        checkLast15 = gainList[len(gainList)-15:len(gainList)] #take the most recent 15 gain values from the overall list
        averageLast10 = np.average(checkLast10)
        medianLast10 = np.median(checkLast10) #the median of the above list to compare against most recent trial
        medianLast15 = np.median(checkLast15)
        averageLast15 = np.average(checkLast15)

        if x.p == check:
            print "The last 10 gains: ", checkLast10
            print "median of last 10 trials: ", medianLast10
            print "average of last 10 trials ", averageLast10

            if averageLast10 < medianLast10 and cogState == 'Explore':
                CAPTTIM = "YELLOW" #the inequality above "ave > median" is the definition of gain (note line 381 that regret is opposite of gain)
                #displayDialog(messageList[0])
            elif averageLast10 < medianLast10 and cogState == 'Exploit':
                CAPTTIM = "RED"
                #displayDialog(messageList[1])
            elif averageLast10 >= medianLast10 and cogState == 'Explore':
                CAPTTIM = "ORANGE"
                #displayDialog(messageList[2])
            elif averageLast10 >= medianLast10 and cogState == 'Exploit':
                CAPTTIM = "GREEN"
                #displayDialog(messageList[3])
            print "CAPTTIM", CAPTTIM

    ## Compute CAPTTIM for every trial and append it to the selection data for later evaluation of proportion of time in R/Y/O/G
    if x.p>50:
        if averageLast10 < medianLast10 and cogState == 'Explore':
            CAPTTIM = "YELLOW"
        elif averageLast10 < medianLast10 and cogState == 'Exploit':
            CAPTTIM = "RED"
        elif averageLast10 >= medianLast10 and cogState == 'Explore':
            CAPTTIM = "ORANGE"
        elif averageLast10 >= medianLast10 and cogState == 'Exploit':
            CAPTTIM = "GREEN"

        selData=[x.p, mach, gainP,lossP,x.oh, click.x, click.y,
                 latent,unixTime,machTime,cogState,CAPTTIM]
        runData.append(selData)

    v_gain.set(0)
    v_loss.set(0)
    if mach == 1:
        v_bdt1.set(b.po[mach])
        v_gain1.set(gain)
    if mach == 2:
        v_bdt2.set(b.po[mach])
        v_gain2.set(gain)
    if mach == 3:
        v_bdt3.set(b.po[mach])
        v_gain3.set(gain)
    if mach == 4:
        v_bdt4.set(b.po[mach])
        v_gain4.set(gain)

    v_plays.set(x.p)
    v_gain.set(gainP)
    v_loss.set(-1*lossP)

    if x.p >= game.playlimit:
        print ("\n\nPLAY LIMIT MET\n\n")
        WriteToFile(runData,subName)
        print ('shut down')
        root.quit()

if __name__ == "__main__":
    app = Application(master=root)
    app.master.title("Route Selection and Battle Damage Tool")
    app.master.minsize(1000,400)
    root.bind("<1>", callback)
    app.mainloop()
    root.destroy()
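
The cognitive-state logic inside getGain can be read in isolation. The following is a minimal Python 3 sketch under stated assumptions, not the code used in the experiment: it smooths decision latencies with an EWMA (weight 0.9 on the newest latency, matching latencyLambda) and labels the subject 'Exploit' when the standard deviation of the last 10 smoothed latencies is at most 1.5 times the baseline standard deviation, and 'Explore' otherwise. The names ewma_latencies and cognitive_state are hypothetical, statistics.pstdev stands in for np.std, the sample latencies are made up, and the EWMA here uses the immediately preceding smoothed value, whereas the listing indexes avglatentList[len(avglatentList)-2].

```python
from statistics import pstdev

LATENCY_LAMBDA = 0.9   # EWMA weight on the newest latency (latencyLambda in the listing)
THRESHOLD = 1.5        # multiple of the baseline standard deviation


def ewma_latencies(latencies, lam=LATENCY_LAMBDA):
    """Return the EWMA-smoothed series for a list of raw latencies (seconds)."""
    smoothed = []
    for t in latencies:
        if not smoothed:
            smoothed.append(t)  # seed the series with the first observation
        else:
            smoothed.append(lam * t + (1 - lam) * smoothed[-1])
    return smoothed


def cognitive_state(baseline_latencies, smoothed, window=10, threshold=THRESHOLD):
    """Label the subject 'Exploit' or 'Explore' from recent latency variability."""
    baseline_sd = pstdev(baseline_latencies)   # population SD, like np.std
    recent_sd = pstdev(smoothed[-window:])     # SD of the most recent smoothed values
    return 'Exploit' if recent_sd <= threshold * baseline_sd else 'Explore'


# Made-up latencies for illustration only.
baseline = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0]
steady = ewma_latencies([2.0] * 12)        # near-constant decision times
erratic = ewma_latencies([0.5, 9.0] * 6)   # strongly alternating decision times

print(cognitive_state(baseline, steady))
print(cognitive_state(baseline, erratic))
```

Keeping the smoothing and the threshold test in separate functions makes it possible to try other values of the EWMA weight or the 1.5 multiplier against recorded latencies without running the Tkinter task.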


FEEDBACK GROUP CODE 

# Optimal Decision Making Demonstration
# Military Wargaming, Convoy Route Selection
#
# Multi Arm Bandit (n=4)
# In support of TRAC Project Code 638
# author: Peter Nesbitt and Cardy Moten III, TRAC-MTRY
# peter.nesbitt@us.army.mil or cardy.moten3.mil@mail.mil
# addition/modification of COGNITIVE STATE and REGRET
# Travis Carlson, MOVES, NPS

# - #
# IMPORTS #
# - #

from random import * 

import random 
import numpy as np 
import time 

from time import localtime, strftime 

from math import * 

import Tkinter 

from Tkinter import * 

import tkMessageBox 

import tkFont 

import Tkinter as tk 

from PIL import Image, ImageTk 

import winsound 
import csv
import array 

from datetime import datetime 
import calendar 


# - # 

# FUNCTIONS AND CLASSES # 

# - # 


class Player:
    def __init__(self,onhand,plays):
        self.oh = onhand
        self.p = plays

class Bandit:
    def __init__(self, l_gain, l_loss, l_payoff):
        self.gain = l_gain # dictionary of initial bandit parameters
        self.loss = l_loss
        self.po = l_payoff # total earned for that machine

class Click:
    def __init__(self,xcoord,ycoord):
        self.x = xcoord # x for every click on canvas
        self.y = ycoord # y for every click on canvas

class Control:
    def __init__(self,l_routeUse):
        self.route = l_routeUse
        self.playlimit= 250

class DecideTime:
    def __init__(self):
        self.start = time.time() # time since last decision

class Application(Frame):
    def __init__(self, master=None):
        Frame.__init__(self, master)
        self.pack()
        self.buildFrame()
        self.remaining = 0

    def restart(self, remaining = None):
        if remaining is not None:
            self.remaining = remaining
        if self.remaining <= 0:
            self.hurry.configure(text="time's up!")
        else:
            self.hurry.configure(text="%d" % self.remaining)
            self.remaining = self.remaining -1
            self.after(1000, self.countdown)


    def buildFrame(self, remaining = None):
        # self.buildFrame2()
        self.customFont1 = tkFont.Font(family="Arial Bold", size=30)
        self.customFont2 = tkFont.Font(family="Arial Bold", size=20)
        self.customFont3 = tkFont.Font(family="Arial Bold", size=10)
        self.customFont4 = tkFont.Font(family="Arial Bold", size=20)

        topLabel = Label(self,text="Select route for next convoy.",
                         font=self.customFont2).grid(row=0,column=2, pady=25)
        #topLabel = Label(self,text="Select route for next convoy.", font=self.customFont2).grid(row=0,column=1, columnspan=3, pady=25)
        #Label(self,text= "").grid(row=5,column=2)
        Label(self,text= "Damage to Enemy Forces",
              font=self.customFont2).grid(row=30,column=1,pady=25)

        #Label(self,text=0, fg="black", font=self.customFont1).grid(row=7,column=2)
        l1 = Label(self,textvariable=v_gain, fg="black",
                   font=self.customFont1).grid(row=25,column=1, pady = 0)
        Label(self,text= "Damage to Friendly Forces",fg="red",
              font=self.customFont2).grid(row=30,column=4)

        l2 = Label(self,textvariable=v_loss, fg="red",
                   font=self.customFont1).grid(row=25,column=4, pady = 0)

        b_1 = Button(self,command=bdt1,bg='white')
        b_2 = Button(self,command=bdt2,bg='white')
        b_3 = Button(self,command=bdt3,bg='white')
        b_4 = Button(self,command=bdt4,bg='white')

        self.photo1=ImageTk.PhotoImage(file="Picture2.png")
        self.photo2=ImageTk.PhotoImage(file="Picture2.png")
        self.photo3=ImageTk.PhotoImage(file="Picture2.png")
        self.photo4=ImageTk.PhotoImage(file="Picture2.png")

        b_1.config(image=self.photo1, width="400",height="400")
        b_2.config(image=self.photo2, width="400",height="400")
        b_3.config(image=self.photo3, width="400",height="400")
        b_4.config(image=self.photo4, width="400",height="400")

        b_1.grid(row=6, column=1)
        b_2.grid(row=6, column=2)
        b_3.grid(row=6, column=3)
        b_4.grid(row=6, column=4)

        l30 = Label(self, text="", fg="gray50",
                    font=self.customFont3,anchor=E).grid(row=3,column=1, pady=20)
        l35 = Label(self, text="", fg="gray50",
                    font=self.customFont3,anchor=E).grid(row=7,column=2, pady=20)
        l36 = Label(self, text="", fg="gray50",
                    font=self.customFont3,anchor=E).grid(row=8,column=2, pady=20)
        l3 = Label(self,text="Accumulated Damage :",fg="gray50",
                   font=self.customFont2,anchor=E).grid(row=1,column=2, pady=20)
        l4 = Label(self,textvariable=v_onhand, fg='gray50',
                   font=self.customFont1).grid(row=1,column=3, pady=25)
        #l5 = Label(self,text="(Positive number is good)",fg="gray50", font=self.customFont2,anchor=E).grid(row=1,column=4)


def callback(e):
    click.x = e.x
    click.y = e.y
    # print "clicked at", e.x, e.y


def WriteToFile(listArray, subName):
    with open(subName, 'wb') as csvfile:
        w = csv.writer(csvfile)
        w.writerow(['trial'] + ['routeSel'] + ['trialGain'] + ['trialLoss'] + ['Damage'] +
                   ['x'] + ['y'] + ['latent'] + ['unixTime'] + ['machTime'] + ['cogState'] + ['CAPTTIM'])
        for e in listArray: # for every trial data array,
            w.writerow(e) # write it to file

def readFromFile():
    with open('TDC.csv', 'rb') as f:
        reader = csv.reader(f)
        for row in reader:
            latentlistR = row[7]
            choiceRead = row[1]
            print "route: ", choiceRead, "latency: ", latentlistR
            #print row[3]

# - # 

# GLOBAL CONSTANTS # 

# - # 


runData = [] # storage tuple to temp store data 
latentList = [] #store latency times 
latentListR = [] 

latentList50 = [] #have to use something different for first 50 because we need the whole latentList later
avglatentList = [] #store EWMA latency times
avgLatencyTime = 0
latencyLambda = 0.9 #EWMA lambda parameter
sdLatencyTime = 0.0 #standard deviation of latency time
cogState = 'test' #Cognitive State string
baseLineLatency = 0

#List of intervention messages
messageList = ["Score could be better; attend to friendly damage",
               "Score could be better; attend to friendly damage and try other routes",
               "Score is looking good; go ahead and make decisions quickly",
               "Score is looking good; stay with your strategy"]

CAPTTIM = ' '

gainList = [] #Capture absolute regret
medGain = [] #Capture median regret values

subName = strftime("%Y %b %d %a %H %M Mil MultiArmBandit.csv", localtime()) # time as file name

root = Tkinter.Tk( ) 

# Player parameters 

x = Player(2000,0) # instantiate player object onhand,plays

irouteUse= {} 

irouteUse[1]= 0 

irouteUse[2]= 0 

irouteUse[3]= 0 

irouteUse[4]= 0 

game = Control(irouteUse) 

time_to_decide = DecideTime() 

click = Click(0,0) 

v_onhand = DoubleVar() # instantiate running total onhand 
v_onhand.set(x.oh) # running total from all machines 
□color = 'black'

v_gain = DoubleVar()
v_gain.set(0) # running total from all machines
v_loss = DoubleVar()
v_loss.set(0) # running total from all machines
v_plays = IntVar()
#v_plays.set(game.playlimit-x.p)
v_plays.set(x.p)

v_bdt1 = DoubleVar() # last payoff value for machine 1
v_bdt2 = DoubleVar()
v_bdt3 = DoubleVar()
v_bdt4 = DoubleVar()

v_gain1 = DoubleVar()
v_gain2 = DoubleVar()
v_gain3 = DoubleVar()
v_gain4 = DoubleVar()

# Bandit parameters held in a dictionary 
igain= {} 

igain[1]= 100 #bandit 1: n1,p1,n2,p2 
igain[2]= 100 
igain[3]= 50 
igain[4]= 50

iloss= {}

iloss[1]= [-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,
0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,
-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,
-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,
-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,
-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,
-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,
0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,
-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,
0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,
-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,
-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,
-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,
-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,
-350,-250,0,-200,0,-300,0,-150,
0,0,-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250,0,-200,0,-300,0,-150,0,0,
-350,-250,0,-200,0,-300,0,-150,0,0,-350,-250]

iloss[2]= [-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,


-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0, 

-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0,-1250,0,0,0,0,0,0,0,0,0] 


iloss[3]= [-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,
-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,
-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,
0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,

0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0, 
-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0, 
-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50, 
-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0, 
-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0, 
0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,
-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0, 
-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50,-50,0,-50,0,-50,0,-50,0,0,-50, 
-50,-50,0,-50,0,-50,0,-50,0] 


iloss[4]= [-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,


-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0, 

-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0,-250,0,0,0,0,0,0,0,0,0] 


print 'Deck A is',len(iloss[1]),'cards long.'
print 'Deck B is',len(iloss[2]),'cards long.'
print 'Deck C is',len(iloss[3]),'cards long.'
print 'Deck D is',len(iloss[4]),'cards long.'


ipayoff= {} 
ipayoff[1]= 0 
ipayoff[2]= 0 
ipayoff[3]= 0
ipayoff[4]= 0 


b=Bandit(igain,iloss,ipayoff) 

v_bdt1.set(b.po[1]) 

v_bdt2.set(b.po[2]) 

v_bdt3.set(b.po[3]) 

v_bdt4.set(b.po[4]) 

v_gain1.set(0) 

v_gain2.set(0) 

v_gain3.set(0) 

v_gain4.set(0) 

def refresh():
    app.mainloop()

def bdt1():
    machine= 1
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt2():
    machine= 2
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt3():
    machine= 3
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

def bdt4():
    machine= 4
    gain=b.gain[machine]
    loss=b.loss[machine]
    getGain(gain,loss,machine)

#Display a modal pop-up info box with the supplied message string
def displayDialog(message):
    tkMessageBox.showinfo("Guidance", message)

def getGain(gain,loss,mach):

    CAPTTIM = ""
    cogState = ""
    gainP = gain
    lossP = -1*loss.pop()
    gain = gainP - lossP

    latent = time.time() - time_to_decide.start
    time_to_decide.start = time.time()
    game.route[mach] += 1

    b.po[mach] = b.po[mach] + gainP + lossP # update earning by machine
    x.oh = x.oh + gain # update earnings total, subtracting any cost to play
    if x.oh < 0:
        □color = 'red'
    else:
        □color = 'black'
    v_onhand.set(x.oh)

    x.p = x.p + 1 # update times game played

    dt = datetime.now()
    machTime= dt.time()
    unixTime = calendar.timegm(dt.utctimetuple())

    if x.p <= 50:
        selData=[x.p, mach, gainP,lossP,x.oh, click.x, click.y,
                 latent,unixTime,machTime, CAPTTIM]
        runData.append(selData) # store data

        # Cognitive State #

        if x.p<=2:
            avgLatencyTime = latent
            avglatentList.append(latent) #Store Average Latency Time
            latentList.append(latent)
            latentList50.append(latent)
        else:
            if gain >= 0: ##only append the latency time to the list if the choice is not 'bad'
                latentList.append(latent)
                latentList50.append(latent)
                #Compute EWMA Latency from Nesbitt Understanding Optimal Decision Making
                avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
                #Store Average Latency Time of these GOOD decisions
                avglatentList.append(avgLatencyTime)
            else:
                avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
                #Still computing the average latency time, just not appending it to the list when subject takes a hit during first 50 trials
                print "bad choice, latency not added to avgLatentList"
        baseLineLatency = np.mean(latentList50)

    else:
        latentList.append(latent) #still have to capture all the raw times? We should be using EWMA for Exp/Exp
        baseLineLatency = np.mean(latentList50) ##there's nothing added to latentList50 after trial 50, so baseLineLatency stays the same
        STDofBaseLineLatency = np.std(latentList50) #Compute the standard deviation of the latency time
        avgLatencyTime = latencyLambda*latentList[len(latentList)-1] + (1-latencyLambda)*avglatentList[len(avglatentList)-2]
        avglatentList.append(avgLatencyTime)
        print "SD of baseline", STDofBaseLineLatency

        #get the standard deviation of the last 10 EWMA latency values from the overall list
        Last10AvgLatencies = avglatentList[len(avglatentList)-10:len(avglatentList)]
        STDLast10 = np.std(Last10AvgLatencies)
        print "STD of Last 10 trials: ", STDLast10

        if STDLast10 <= 1.5*STDofBaseLineLatency:
            cogState = 'Exploit'
        elif STDLast10 > 1.5*STDofBaseLineLatency:
            cogState = 'Explore'
        print "CogState is: %s" %cogState

    # Regret #

    gainList.append(gain)
    regret = -gain

    for check in range(50,game.playlimit,10):
        checkLast10 = gainList[len(gainList)-10:len(gainList)] #take the most recent 10 gain values from the overall list
        checkLast15 = gainList[len(gainList)-15:len(gainList)] #take the most recent 15 gain values from the overall list
        averageLast10 = np.average(checkLast10)
        medianLast10 = np.median(checkLast10) #the median of the above list to compare against most recent trial
        medianLast15 = np.median(checkLast15)
        averageLast15 = np.average(checkLast15)

        if x.p == check:
            print "The last 10 gains: ", checkLast10
            print "median of last 10 trials: ", medianLast10
            print "average of last 10 trials ", averageLast10

            if averageLast10 < medianLast10 and cogState == 'Explore':
                CAPTTIM = "YELLOW" #the inequality above "ave > median" is the definition of gain (note line 381 that regret is opposite of gain)
                displayDialog(messageList[0])
            elif averageLast10 < medianLast10 and cogState == 'Exploit':
                CAPTTIM = "RED"
                displayDialog(messageList[1])
            elif averageLast10 >= medianLast10 and cogState == 'Explore':
                CAPTTIM = "ORANGE"
                displayDialog(messageList[2])
            elif averageLast10 >= medianLast10 and cogState == 'Exploit':
                CAPTTIM = "GREEN"
                displayDialog(messageList[3])
            print "CAPTTIM", CAPTTIM

    ## Compute CAPTTIM for every trial and append it to the selection data for later evaluation of proportion of time in R/Y/O/G
    if x.p>50:
        if averageLast10 < medianLast10 and cogState == 'Explore':
            CAPTTIM = "YELLOW"
        elif averageLast10 < medianLast10 and cogState == 'Exploit':
            CAPTTIM = "RED"
        elif averageLast10 >= medianLast10 and cogState == 'Explore':
            CAPTTIM = "ORANGE"
        elif averageLast10 >= medianLast10 and cogState == 'Exploit':
            CAPTTIM = "GREEN"

        selData=[x.p, mach, gainP,lossP,x.oh, click.x, click.y,
                 latent,unixTime,machTime,cogState,CAPTTIM]
        runData.append(selData)

    v_gain.set(0)
    v_loss.set(0)
    if mach == 1:
        v_bdt1.set(b.po[mach])
        v_gain1.set(gain)
    if mach == 2:
        v_bdt2.set(b.po[mach])
        v_gain2.set(gain)
    if mach == 3:
        v_bdt3.set(b.po[mach])
        v_gain3.set(gain)
    if mach == 4:
        v_bdt4.set(b.po[mach])
        v_gain4.set(gain)

    v_plays.set(x.p)
    v_gain.set(gainP)
    v_loss.set(-1*lossP)

    if x.p >= game.playlimit:
        print ("\n\nPLAY LIMIT MET\n\n")
        WriteToFile(runData,subName)
        print ('shut down')
        root.quit()

if __name__ == "__main__":
    app = Application(master=root)
    app.master.title("Route Selection and Battle Damage Tool")
    app.master.minsize(1000,400)
    root.bind("<1>", callback)
    app.mainloop()
    root.destroy()
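
The CAPTTIM classification and the choice of feedback message in the listing above reduce to a small decision table. The sketch below is a simplified Python 3 illustration, not the experiment code: decision performance is taken as "mean of the recent gains is at least their median", and it is crossed with the cognitive state to give a color. The names capttim_status and MESSAGES are hypothetical, the message texts are copied from messageList, the sample gains are made up, and statistics.mean and statistics.median stand in for np.average and np.median.

```python
from statistics import mean, median

# Intervention messages keyed by CAPTTIM color, in the order used by messageList.
MESSAGES = {
    'YELLOW': "Score could be better; attend to friendly damage",
    'RED': "Score could be better; attend to friendly damage and try other routes",
    'ORANGE': "Score is looking good; go ahead and make decisions quickly",
    'GREEN': "Score is looking good; stay with your strategy",
}


def capttim_status(recent_gains, cog_state):
    """Combine decision performance and cognitive state into a CAPTTIM color."""
    if cog_state not in ('Explore', 'Exploit'):
        return ''  # no cognitive state yet, so no classification
    performing_well = mean(recent_gains) >= median(recent_gains)
    if cog_state == 'Explore':
        return 'ORANGE' if performing_well else 'YELLOW'
    return 'GREEN' if performing_well else 'RED'


# Made-up gains for the last 10 trials: nine small wins and one large loss.
recent = [50] * 9 + [-200]
status = capttim_status(recent, 'Exploit')
print(status, '->', MESSAGES[status])
```

In the feedback group the message would be shown in a dialog every tenth trial after trial 50; in the control group the same color is computed and logged but no message is displayed.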

APPENDIX C. DEMOGRAPHIC SURVEY


Convoy Task 
Demographic Survey 

Subject number: Date: 

1. Age:_

2. Gender: Female_ Male_ 

3. Preferred hand for writing: Left:_ Right:_ 

4. Are you currently serving in the Armed Forces: Yes No 

a. Which branch:_ 

b. Years of service:_ 

c. Highest rank:_ 

d. Have you deployed to a combat zone (receipt of Imminent Danger Pay)?

No (skip to e.) Yes (i - iii below)

i. Date of return from latest deployment_

ii. Role during deployment (e.g. Surface Warfare Officer, Engineer
Company Commander, Division Logistics Officer, AH-1W section lead,
etc.)_

iii. Responsibilities (Route clearance, Fires planner, etc.):

e. If no combat deployment, what was your billet/rate immediately prior to NPS? 

f. If no combat deployment, what were your responsibilities immediately prior to 

NPS?_ 

APPENDIX D. POST TASK SURVEY


Convoy Task 
Post Task Survey 

Subject number: Date: 

1. During the Convoy Task how did you determine which route to select? 

2. If you used a particular strategy what was it? 

a. Did your strategy change during the task? 

b. If yes, at about which point (e.g. right away, about halfway, toward the
end) did you change your approach?

c. If yes, what caused you to change your approach? 


3. Rank the routes overall from safest (1) to most dangerous (4):


Left 

Center Left 

Center Right 

Right 